7 research outputs found

    Stock market prediction using machine learning classifiers and social media, news

    Accurate stock market prediction is of great interest to investors; however, stock markets are driven by volatile factors such as microblogs and news, which make it hard to predict the stock market index based merely on historical data. The enormous stock market volatility emphasizes the need to effectively assess the role of external factors in stock prediction. Stock markets can be predicted using machine learning algorithms on information contained in social media and financial news, as this data can change investors' behavior. In this paper, we apply algorithms to social media and financial news data to discover the impact of this data on stock market prediction accuracy for ten subsequent days. To improve the performance and quality of predictions, feature selection and spam tweet reduction are performed on the data sets. Moreover, we perform experiments to find the stock markets that are difficult to predict and those that are more influenced by social media and financial news. We compare the results of different algorithms to find a consistent classifier. Finally, to achieve maximum prediction accuracy, deep learning is used and some classifiers are ensembled. Our experimental results show that the highest prediction accuracies of 80.53% and 75.16% are achieved using social media and financial news, respectively. We also show that New York and Red Hat stock markets are hard to predict, that New York and IBM stocks are more influenced by social media, and that London and Microsoft stocks are more influenced by financial news. The random forest classifier is found to be consistent, and the highest accuracy of 83.22% is achieved by its ensemble.
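As a rough illustration of the kind of pipeline this abstract describes, the sketch below ensembles a random forest with other classifiers on top of a feature-selection step. It is a minimal, hypothetical example: the placeholder feature matrix, the chosen base learners, and all parameter values are assumptions, not the authors' setup.

```python
# Minimal sketch: feature selection followed by an ensemble that includes a
# random forest, roughly mirroring the pipeline the abstract describes.
# The synthetic data, the 10-day label, and all parameters are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder data: rows are trading days, columns are aggregated sentiment
# and historical-price features; y is the up/down movement label.
X = rng.normal(size=(500, 40))
y = rng.integers(0, 2, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = Pipeline([
    ("select", SelectKBest(f_classif, k=20)),        # feature selection step
    ("ensemble", VotingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=300, random_state=0)),
            ("lr", LogisticRegression(max_iter=1000)),
            ("svm", SVC(probability=True)),
        ],
        voting="soft",
    )),
])

model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```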

    Review of Big Data Analytics, Artificial Intelligence and Nature-Inspired Computing Models towards Accurate Detection of COVID-19 Pandemic Cases and Contact Tracing

    The emergence of the 2019 novel coronavirus (COVID-19), which was declared a pandemic, has spread to 210 countries worldwide. It has had a significant impact on health systems and on the economic, educational, and social facets of contemporary society. As the rate of transmission increases, various collaborative approaches among stakeholders have evolved to develop innovative means of screening, detecting, and diagnosing COVID-19 cases among human beings at a commensurate rate. Further, the utility of computing models associated with fourth industrial revolution technologies in achieving the desired feat has been highlighted. However, there is a gap in terms of the accuracy of detection and prediction of COVID-19 cases and the tracing of contacts of infected persons. This paper presents a review of computing models that can be adopted to enhance the performance of detecting and predicting COVID-19 pandemic cases. We focus on big data, artificial intelligence (AI), and nature-inspired computing (NIC) models that can be adopted in the current pandemic. The review suggested that artificial intelligence models have been used for COVID-19 case detection. Similarly, big data platforms have also been applied for contact tracing. However, the nature-inspired computing (NIC) models that have demonstrated good performance in feature selection for medical problems are yet to be explored for case detection and contact tracing in the current COVID-19 pandemic. This study holds salient implications for practitioners and researchers alike, as it elucidates the potential of NIC in the accurate detection of pandemic cases and optimized contact tracing.
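The review highlights NIC models that perform well at feature selection for medical problems; as a hedged illustration of that idea (not a method from the paper), the sketch below runs a tiny binary particle swarm that selects features for a kNN classifier. The dataset, the classifier, and the swarm parameters are illustrative assumptions.

```python
# Hedged sketch of a nature-inspired feature-selection loop: a small binary
# particle swarm searches over feature subsets, scored by cross-validation.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X, y = load_breast_cancer(return_X_y=True)
n_features, n_particles, n_iters = X.shape[1], 12, 15

def fitness(bits):
    """Cross-validated accuracy of a kNN classifier on the selected features."""
    mask = bits.astype(bool)
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()

pos = (rng.random((n_particles, n_features)) > 0.5).astype(float)   # binary positions
vel = rng.normal(scale=0.1, size=(n_particles, n_features))
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iters):
    r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = (rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(float)  # sigmoid transfer
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print("selected features:", int(gbest.sum()), "cv accuracy:", round(pbest_fit.max(), 4))
```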

    Digital technologies catalyzing business model innovation in supply chain management - the case of parcel lockers as a solution for improving sustainable city mobility

    The rise of information technologies pushes companies into digital restructuring. Organizations that integrate emerging technologies into their supply chains can boost efficiency by streamlining processes and making more informed decisions using predictive analytics. This research discusses major enablers of digital transformation and presents their application along different parts of a digital supply chain, focusing on technical characteristics, implementations, and impact on organizational capabilities and strategies. Parcel lockers are a technology that sustains and improves last-mile delivery. Combining them with night-time delivery improves the city's sustainable mobility and thereby reduces local emissions and city congestion.

    Mining Large Volumes of Data Using Hadoop MapReduce and Bio-inspired Algorithms: A Systematic Review

    The field of Data Mining has been applied in many application areas and aims to extract knowledge through data analysis. In recent decades, many databases have tended toward large volume, high growth velocity, and great variety. This phenomenon is known as Big Data and poses new challenges for classical technologies such as Relational Database Management Systems, which have not offered satisfactory performance and scalability for Big Data applications. Unlike those technologies, Hadoop MapReduce is a framework that, in addition to providing parallel processing, also offers fault tolerance and easy scalability on top of a distributed storage system suited to the Big Data scenario. One class of techniques that has been used in the Big Data context is bio-inspired algorithms. These algorithms are good solution options for complex multidimensional, multi-objective, and large-scale problems. The combination of Hadoop MapReduce-based systems and bio-inspired algorithms has proven advantageous in Big Data applications. This article presents a systematic review of work in this context, analyzing criteria such as the data mining tasks addressed, the bio-inspired algorithms used, the availability of the data sets used, and which Big Data characteristics are handled in the works. As a result, this article discusses the analyzed criteria, identifies some parallelization models, and suggests a direction for future work.
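As a hedged illustration of the parallelization pattern surveyed here, the sketch below pushes the costly fitness-evaluation step of a bio-inspired population into Hadoop MapReduce using the mrjob library, one candidate solution per input line. The line encoding (comma-separated reals), the toy sphere objective, and the class name are assumptions.

```python
# Hedged sketch: MapReduce-based fitness evaluation of a population.
# Each input line holds one candidate solution, e.g. "0.1,2.3,-1.0".
from mrjob.job import MRJob

class EvaluatePopulation(MRJob):
    def mapper(self, _, line):
        # Decode one candidate and compute its fitness (toy sphere objective).
        genes = [float(v) for v in line.strip().split(",")]
        fitness = sum(g * g for g in genes)
        yield "best", (fitness, genes)

    def reducer(self, key, values):
        # Keep only the best (lowest-fitness) candidate of this generation.
        yield key, min(values, key=lambda fv: fv[0])

if __name__ == "__main__":
    EvaluatePopulation.run()
```

With mrjob installed, such a job would typically be launched as `python evaluate_population.py -r hadoop population.txt`, with `population.txt` holding one encoded candidate per line (the file name and runner choice are likewise assumptions).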

    Big Data Optimization: Algorithmic Framework for Data Analysis Guided by Semantics

    Thesis defense date: 9 November 2018. Over the past decade, the rapid rise of data creation in all domains of knowledge, such as traffic, medicine, social networks, and industry, has highlighted the need to enhance the process of analyzing large data volumes, in order to manage them more easily and, in addition, discover new relationships hidden within them. Optimization problems, which are commonly found in current industry, are not unrelated to this trend; therefore, Multi-Objective Optimization Algorithms (MOAs) should bear this new scenario in mind. This means that MOAs have to deal with problems that have various data sources (typically streaming) or huge amounts of data. These features, in particular, are found in Dynamic Multi-Objective Problems (DMOPs), which are related to Big Data optimization problems, mostly with regard to velocity and variability. When dealing with DMOPs, whenever there are changes in the environment that affect the solutions of the problem (i.e., the Pareto set, the Pareto front, or both), and therefore the fitness landscape, the optimization algorithm must react and adapt the search to the new features of the problem. Big Data analytics is a long and complex process; therefore, with the aim of simplifying it, it is carried out through a series of steps. A typical analysis is composed of data collection, data manipulation, data analysis, and finally result visualization. In the process of creating a Big Data workflow, the analyst should bear in mind the semantics of the problem domain knowledge and its data. An ontology is the standard way of describing the knowledge about a domain. As the overall goal of this PhD thesis, we are interested in investigating the use of semantics in the process of Big Data analysis, not only for machine learning analysis but also for optimization.
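As a minimal, assumed sketch of the reaction-to-change behaviour described for DMOPs (not the algorithmic framework of the thesis), the loop below re-evaluates a few sentinel solutions each step and re-seeds part of the population when their objective values drift. The toy bi-objective problem, the change-detection threshold, and the restart fraction are all illustrative choices.

```python
# Hedged sketch of a dynamic multi-objective loop: detect environment change
# via sentinel re-evaluation, then react by re-seeding part of the population.
import random

def objectives(x, t):
    # Toy bi-objective problem whose optimum shifts with time t.
    return (x - t) ** 2, (x - t - 2) ** 2

def changed(sentinels, t, memo, eps=1e-9):
    # Change detection: re-evaluate stored sentinel solutions and compare.
    current = [objectives(s, t) for s in sentinels]
    drift = any(abs(a - b) > eps
                for old, new in zip(memo, current) for a, b in zip(old, new))
    memo[:] = current
    return drift

population = [random.uniform(-5, 5) for _ in range(20)]
sentinels = population[:3]
memo = [objectives(s, 0) for s in sentinels]

for t in range(10):                       # t plays the role of streaming time
    if changed(sentinels, t, memo):
        # React: re-seed half of the population with fresh random solutions.
        population[10:] = [random.uniform(-5, 5) for _ in range(10)]
    # One deliberately naive search step: local perturbation with greedy keep.
    for i, x in enumerate(population):
        cand = x + random.gauss(0, 0.3)
        if sum(objectives(cand, t)) < sum(objectives(x, t)):
            population[i] = cand
    best = min(population, key=lambda x: sum(objectives(x, t)))
    print(f"t={t} best={best:.3f} objectives={objectives(best, t)}")
```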

    Bio-inspired optimization in integrated river basin management

    Water resources worldwide are facing severe challenges in terms of quality and quantity. It is essential to conserve, manage, and optimize water resources and their quality through integrated water resources management (IWRM). IWRM is an interdisciplinary field that works on multiple levels to maximize the socio-economic and ecological benefits of water resources. Since this is directly influenced by the river's ecological health, the point of interest should start at the basin level. The main objective of this study is to evaluate the application of bio-inspired optimization techniques in integrated river basin management (IRBM). This study demonstrates the application of versatile, flexible, and yet simple metaheuristic bio-inspired algorithms in IRBM. In a novel approach, the bio-inspired optimization algorithms Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) are used to spatially distribute mitigation measures within a basin to reduce the long-term annual mean total nitrogen (TN) concentration at the outlet of the basin. The Upper Fuhse river basin, implemented in the hydrological model Hydrological Predictions for the Environment (HYPE), is used as a case study. ACO and PSO are coupled with the HYPE model to distribute a set of measures and compute the resulting TN reduction. The algorithms spatially distribute nine crop- and subbasin-level mitigation measures under four categories. Both algorithms successfully yield a discrete combination of measures that reduces the long-term annual mean TN concentration. They achieved an 18.65% reduction, and their performance was on par with each other. This study has established the applicability of these bio-inspired optimization algorithms to distributing TN mitigation measures within the river basin. Stakeholder involvement is a crucial aspect of IRBM. It ensures that researchers and policymakers are aware of the situation on the ground through the large amount of information collected from stakeholders. Including stakeholders in policy planning and decision-making legitimizes the decisions and eases their implementation. Therefore, a socio-hydrological framework is developed and tested in the Larqui river basin, Chile, based on a field survey, to explore the conditions under which farmers would implement or extend the width of vegetative filter strips (VFS) to prevent soil erosion. The framework consists of a behavioral social model (extended Theory of Planned Behavior, TPB) and an agent-based model (ABM, developed in NetLogo) coupled with results from the vegetative filter model (Vegetative Filter Strip Modeling System, VFSMOD-W). The results showed that the ABM corroborates the survey results and that the farmers are willing to extend the width of VFS as long as their utility stays positive. This framework can be used to develop tailor-made policies for river basins, based on the conditions of the river basins and the stakeholders' requirements, to motivate them to adopt sustainable practices. It is vital to assess whether the proposed management plans achieve the expected results for the river basin and whether the stakeholders will accept and implement them. Assessment via simulation tools ensures effective implementation and realization of the targets stipulated by the decision-makers. In this regard, this dissertation introduces the application of bio-inspired optimization techniques to the field of IRBM. The successful discrete combinatorial optimization of the spatial distribution of mitigation measures by ACO and PSO, and the novel socio-hydrological framework using an ABM, demonstrate the strength and diverse applicability of bio-inspired optimization algorithms.
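As a hedged sketch of the discrete combinatorial setting described above, the code below uses an ant-colony-style search to assign one mitigation measure to each subbasin and scores each assignment with a placeholder TN-reduction table. In the study itself the optimizers are coupled with the HYPE model; the random effectiveness values, problem sizes, and pheromone parameters here are stand-in assumptions.

```python
# Hedged sketch: ACO-style assignment of one mitigation measure per subbasin,
# maximizing a placeholder total-nitrogen (TN) reduction estimate.
import random

N_SUBBASINS, N_MEASURES, N_ANTS, N_ITERS = 8, 4, 10, 30
EVAPORATION, Q = 0.2, 1.0

# Placeholder: assumed TN reduction (%) of each measure in each subbasin,
# standing in for a run of the coupled hydrological simulation.
effect = [[random.uniform(0, 5) for _ in range(N_MEASURES)] for _ in range(N_SUBBASINS)]

def tn_reduction(assignment):
    return sum(effect[b][m] for b, m in enumerate(assignment))

pheromone = [[1.0] * N_MEASURES for _ in range(N_SUBBASINS)]
best, best_val = None, -1.0

for _ in range(N_ITERS):
    solutions = []
    for _ in range(N_ANTS):
        # Each ant picks one measure per subbasin, proportionally to pheromone.
        assignment = [random.choices(range(N_MEASURES), weights=pheromone[b])[0]
                      for b in range(N_SUBBASINS)]
        solutions.append((tn_reduction(assignment), assignment))
    it_best_val, it_best = max(solutions)
    if it_best_val > best_val:
        best_val, best = it_best_val, it_best
    # Evaporate all trails, then reinforce the iteration-best assignment.
    for b in range(N_SUBBASINS):
        for m in range(N_MEASURES):
            pheromone[b][m] *= (1.0 - EVAPORATION)
        pheromone[b][it_best[b]] += Q * it_best_val / (5 * N_SUBBASINS)

print("best assignment:", best, "estimated TN reduction:", round(best_val, 2))
```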

    Towards a more efficient use of computational budget in large-scale black-box optimization

    Evolutionary algorithms are general-purpose optimizers that have been shown to be effective in solving a variety of challenging optimization problems. In contrast to mathematical programming models, evolutionary algorithms do not require derivative information and remain effective when the algebraic formula of the given problem is unavailable. Nevertheless, the rapid advances in science and technology have witnessed the emergence of more complex optimization problems than ever, which pose significant challenges to traditional optimization methods. The dimensionality of the search space, especially when the available computational budget is limited, is one of the main contributors to a problem's difficulty and complexity. This so-called curse of dimensionality can significantly affect the efficiency and effectiveness of optimization methods, including evolutionary algorithms. This research studies two topics related to a more efficient use of the computational budget in evolutionary algorithms when solving large-scale black-box optimization problems: the role of population initializers in saving computational resources, and computational budget allocation in cooperative coevolutionary algorithms. Consequently, this dissertation consists of two major parts, each of which relates to one of these research directions. In the first part, we review several population initialization techniques that have been used in evolutionary algorithms and categorize them from different perspectives. The contribution of each category to improving evolutionary algorithms in solving large-scale problems is measured. We also study the mutual effect of population size and initialization technique on the performance of evolutionary techniques when dealing with large-scale problems. Finally, assuming uniformity of the initial population to be a key contributor to saving a significant part of the computational budget, we investigate whether achieving a high level of uniformity in high-dimensional spaces is feasible given practical restrictions on computational resources. In the second part of the thesis, we study large-scale imbalanced problems. In many real-world applications, a large problem may consist of subproblems with different degrees of difficulty and importance. In addition, the solution to each subproblem may contribute differently to the overall objective value of the final solution. When the computational budget is restricted, which is the case in many practical problems, investing the same portion of resources in optimizing each of these imbalanced subproblems is not the most efficient strategy. Therefore, we examine several ways to learn the contribution of each subproblem and then dynamically allocate the limited computational resources to each of them according to its contribution to the overall objective value of the final solution. To demonstrate the effectiveness of the proposed framework, we design a new set of 40 large-scale imbalanced problems and study the performance of some possible instances of the framework.
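As a minimal sketch of the contribution-based budget allocation idea described in the second part (under assumptions, not the dissertation's actual framework), the snippet below treats the overall objective as a sum of imbalanced subproblems and repeatedly grants a slice of evaluations to the subcomponent whose most recent improvement was largest. The weighted sphere subproblems, the naive local search, and all constants are illustrative.

```python
# Hedged sketch: contribution-based allocation of a limited evaluation budget
# across imbalanced subproblems in a cooperative-coevolution-like setting.
import random

WEIGHTS = [100.0, 10.0, 1.0]                 # imbalanced subproblem importance
DIM = 5                                      # variables per subcomponent

def sub_objective(i, xs):
    # Weighted sphere function for subcomponent i (to be minimized).
    return WEIGHTS[i] * sum(x * x for x in xs)

solution = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in WEIGHTS]
sub_values = [sub_objective(i, xs) for i, xs in enumerate(solution)]
contribution = [float("inf")] * len(WEIGHTS)  # optimistic start: try everyone once
budget = 3000

while budget > 0:
    i = max(range(len(WEIGHTS)), key=lambda k: contribution[k])
    before = sub_values[i]
    # Spend a slice of the budget on a naive local search in subcomponent i.
    for _ in range(50):
        cand = [x + random.gauss(0, 0.2) for x in solution[i]]
        if sub_objective(i, cand) < sub_objective(i, solution[i]):
            solution[i] = cand
        budget -= 1
    sub_values[i] = sub_objective(i, solution[i])
    contribution[i] = before - sub_values[i]  # observed gain = estimated contribution

print("final overall objective:", round(sum(sub_values), 3))
print("per-subproblem values:", [round(v, 3) for v in sub_values])
```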