
    A survey on tidal analysis and forecasting methods for Tsunami detection

    Accurate analysis and forecasting of tidal levels are very important tasks for human activities in oceanic and coastal areas. They can be crucial in catastrophic situations such as the occurrence of Tsunamis, in order to provide rapid alerts to the populations involved and to save lives. Conventional tidal forecasting methods are based on harmonic analysis, using the least squares method to determine the harmonic parameters. However, a large number of parameters and long-term measured data are required for precise tidal level predictions with harmonic analysis. Furthermore, traditional harmonic methods rely on models based on the analysis of astronomical components, and they can be inadequate when the contribution of non-astronomical components, such as the weather, is significant. Other alternative approaches have been developed in the literature to deal with these situations and to provide predictions with the desired accuracy, also with respect to the length of the available tidal record. These methods include standard high-pass or band-pass filtering techniques, although the relatively deterministic character and large amplitude of tidal signals make special techniques, such as artificial neural networks and wavelet transform analysis methods, more effective. This paper is intended to provide the communities of both researchers and practitioners with a broadly applicable, up-to-date coverage of tidal analysis and forecasting methodologies that have proven successful in a variety of circumstances and that hold particular promise for the future. Classical and novel methods are reviewed in a systematic and consistent way, outlining their main concepts and components, similarities and differences, and advantages and disadvantages.
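
    As a hedged illustration of the classical harmonic approach described above, the sketch below fits a handful of standard tidal constituents (M2, S2, K1, O1) to a synthetic hourly record by least squares and uses the fitted coefficients to forecast ahead. The constituent choice, the synthetic data, and the helper names are illustrative assumptions, not the survey's reference implementation.

```python
# A minimal sketch of harmonic tidal analysis via least squares, assuming hourly
# sea-level samples and a hand-picked set of constituents; the data are synthetic.
import numpy as np

SPEEDS_DEG_PER_HOUR = {"M2": 28.9841042, "S2": 30.0, "K1": 15.0410686, "O1": 13.9430356}

def fit_harmonics(t_hours, levels, speeds=SPEEDS_DEG_PER_HOUR):
    """Least-squares fit of h(t) = z0 + sum_k [a_k cos(w_k t) + b_k sin(w_k t)]."""
    omegas = np.deg2rad(np.array(list(speeds.values())))   # rad per hour
    cols = [np.ones_like(t_hours)]
    for w in omegas:
        cols += [np.cos(w * t_hours), np.sin(w * t_hours)]
    design = np.column_stack(cols)
    coeffs, *_ = np.linalg.lstsq(design, levels, rcond=None)
    return omegas, coeffs

def predict(t_hours, omegas, coeffs):
    h = np.full_like(t_hours, coeffs[0], dtype=float)
    for i, w in enumerate(omegas):
        a, b = coeffs[1 + 2 * i], coeffs[2 + 2 * i]
        h += a * np.cos(w * t_hours) + b * np.sin(w * t_hours)
    return h

if __name__ == "__main__":
    t = np.arange(0.0, 24 * 30, 1.0)                        # one month of hourly samples
    true = 1.2 * np.cos(np.deg2rad(28.9841042) * t - 0.7)   # synthetic M2-dominated tide
    obs = true + 0.05 * np.random.default_rng(0).normal(size=t.size)
    omegas, coeffs = fit_harmonics(t, obs)
    future = np.arange(24 * 30, 24 * 31, 1.0)
    print(predict(future, omegas, coeffs)[:5])              # forecast for the next day
```

    As the abstract notes, such a fit needs a long record and many constituents to be precise, which is exactly what motivates the neural-network and wavelet alternatives the survey reviews.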

    Ensembling classical machine learning and deep learning approaches for morbidity identification from clinical notes

    The past decade has seen an explosion in the amount of digital information generated within the healthcare domain. Digital data exist in the form of images, video, speech, transcripts, electronic health records, clinical records, and free text. Analysing and interpreting healthcare data is a daunting task that demands a great deal of time, resources, and human effort. In this paper, we focus on the problem of co-morbidity recognition from patients' clinical records. To this aim, we employ both classical machine learning and deep learning approaches. We use word embeddings and bag-of-words representations, coupled with feature selection techniques. The goal of our work is to develop a classification system that identifies whether a certain health condition occurs for a patient by studying his/her past clinical records. In more detail, we have used pre-trained word2vec, GloVe, fastText, and universal sentence encoder embeddings, as well as domain-trained embeddings, to tackle the classification of sixteen morbidity conditions within clinical records. We have compared the outcomes of classical machine learning and deep learning approaches across the employed feature representation and feature selection methods. We present a comprehensive discussion of the performance and behaviour of the employed classical machine learning and deep learning approaches. Finally, we have also used ensemble learning techniques over a large number of classifier combinations to improve single-model performance. For our experiments, we used the n2c2 natural language processing research dataset, released by Harvard Medical School. The dataset consists of clinical notes that contain patient discharge summaries. Given the imbalance of the data and their small size, the experimental results indicate the advantage of the ensemble learning technique with respect to single-classifier models. In particular, the ensemble learning technique slightly improved the performance of single classification models, but greatly reduced the variance of the predictions, stabilizing the accuracies (i.e., yielding a lower standard deviation than single classifiers). In real-life scenarios, our work can be employed to identify morbidity conditions of patients with high accuracy by feeding our tool with their current clinical notes. Moreover, other domains where classification is a common problem might benefit from our approach as well.
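
    The sketch below gives a rough, scikit-learn-based picture of the bag-of-words plus feature selection plus ensemble idea summarized above. The tiny in-line notes, the binary label, and the chosen estimators are placeholders, not the n2c2 discharge summaries or the paper's exact classifier combinations.

```python
# A minimal sketch: tf-idf bag-of-words -> feature selection -> voting ensemble.
# The toy "notes" and labels are illustrative placeholders only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.pipeline import Pipeline

notes = ["patient with elevated fasting glucose and polyuria",
         "no evidence of hyperglycemia, routine follow-up",
         "started on metformin for type 2 diabetes",
         "healthy adult, annual physical, no complaints"]
labels = [1, 0, 1, 0]   # 1 = morbidity present (e.g. diabetes), 0 = absent

ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("svm", LinearSVC()),
                ("rf", RandomForestClassifier(n_estimators=200, random_state=0))],
    voting="hard")   # majority vote over heterogeneous models

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("select", SelectKBest(chi2, k=10)),    # simple feature selection step
    ("clf", ensemble)])

pipeline.fit(notes, labels)
print(pipeline.predict(["patient reports polyuria and high glucose readings"]))
```

    Averaging or voting over many such classifiers is the ensemble step the abstract credits with reducing the variance of the predictions on small, imbalanced data.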

    TF-IDF vs word embeddings for morbidity identification in clinical notes: An initial study

    Today, we are seeing an ever-increasing number of clinical notes that contain clinical results, images, and textual descriptions of patients' health states. All these data can be analyzed and employed to provide novel services that can help people and domain experts with their common healthcare tasks. However, many technologies such as Deep Learning, and tools like Word Embeddings, have started to be investigated only recently, and many challenges remain open when it comes to healthcare domain applications. To address these challenges, we propose the use of Deep Learning and Word Embeddings for identifying sixteen morbidity types within the textual descriptions of clinical records. For this purpose, we use a Deep Learning model based on Bidirectional Long Short-Term Memory (LSTM) layers, which can exploit state-of-the-art vector representations of data such as Word Embeddings. We employ pre-trained Word Embeddings, namely GloVe and Word2Vec, as well as our own Word Embeddings trained on the target domain. Furthermore, we compare the performance of the deep learning approaches against traditional tf-idf baselines using a Support Vector Machine and a Multilayer Perceptron. The obtained results suggest that the baselines outperform the Deep Learning approaches regardless of the word embeddings used. Our preliminary results indicate that there are specific features that make the dataset biased in favour of traditional machine learning approaches.
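
    For readers unfamiliar with the deep-learning side of this comparison, the following is a minimal Keras sketch of a bidirectional LSTM text classifier. The trainable embedding layer stands in for the pre-trained GloVe/Word2Vec vectors mentioned above, and the two toy notes and labels are placeholders, not the real clinical records or the paper's architecture.

```python
# A minimal BiLSTM text-classifier sketch; vocabulary size, sequence length,
# and the tiny dataset are assumptions for illustration only.
import numpy as np
import tensorflow as tf

notes = np.array(["patient with poorly controlled type 2 diabetes",
                  "no chronic conditions, routine annual check-up"])
labels = np.array([1, 0])   # 1 = morbidity present, 0 = absent

vectorizer = tf.keras.layers.TextVectorization(max_tokens=5000, output_sequence_length=50)
vectorizer.adapt(notes)

model = tf.keras.Sequential([
    vectorizer,
    tf.keras.layers.Embedding(input_dim=5000, output_dim=100),  # pre-trained vectors could be loaded here
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(notes, labels, epochs=2, verbose=0)
print(model.predict(np.array(["long history of diabetes mellitus"])))
```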

    A blockchain-based distributed paradigm to secure localization services

    In recent decades, modern societies have experienced an increasing adoption of interconnected smart devices. This revolution involves not only canonical devices such as smartphones and tablets, but also simple objects like light bulbs. Known as the Internet of Things (IoT), this ever-growing scenario offers enormous opportunities in many areas of modern society, especially when combined with other emerging technologies such as the blockchain. Indeed, the latter allows users to certify transactions publicly, without relying on central authorities or intermediaries. This work aims to exploit the scenario above by proposing a novel blockchain-based distributed paradigm to secure localization services, here named the Internet of Entities (IoE). It represents a mechanism for the reliable localization of people and things, and it exploits the increasing number of existing wireless devices and blockchain-based distributed ledger technologies. Moreover, unlike most canonical localization approaches, it is strongly oriented towards the protection of users' privacy. Finally, its implementation requires minimal effort, since it employs existing infrastructures and devices, thus giving life to a new and wide data environment, exploitable in many domains such as e-health, smart cities, and smart mobility.
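
    Purely as an illustration of the kind of record such a system might anchor on a distributed ledger, the sketch below hashes and signs a "sighting" of an entity by a wireless device. Every name here (the observer, the entity identifier, the demo key) is hypothetical, and the paper's actual protocol and data model are not reproduced.

```python
# Hypothetical attestation record for an IoE-style localization sighting.
import hashlib, hmac, json, time

OBSERVER_SECRET = b"demo-key-not-for-production"   # a real deployment would use asymmetric keys

def build_attestation(observer_id: str, entity_id: str, rssi: int) -> dict:
    sighting = {
        "observer": observer_id,      # e.g. a Wi-Fi access point or another user's phone
        "entity": entity_id,          # pseudonymous identifier of the entity being located
        "rssi": rssi,                 # received signal strength, a coarse distance proxy
        "timestamp": int(time.time()),
    }
    payload = json.dumps(sighting, sort_keys=True).encode()
    return {
        "sighting_hash": hashlib.sha256(payload).hexdigest(),  # what would be anchored on-chain
        "signature": hmac.new(OBSERVER_SECRET, payload, hashlib.sha256).hexdigest(),
    }

print(build_attestation("ap-042", "entity-7f3a", rssi=-61))
```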

    Statistical arbitrage powered by Explainable Artificial Intelligence

    Machine learning techniques have recently become the norm for detecting patterns in financial markets. However, relying solely on machine learning algorithms for decision-making can have negative consequences, especially in a critical domain such as finance. On the other hand, it is well known that transforming data into actionable insights can be a challenge even for seasoned practitioners, particularly in the financial world. Given these compelling reasons, this work proposes a machine learning approach powered by eXplainable Artificial Intelligence techniques, integrated into a statistical arbitrage trading pipeline. Specifically, we propose three methods to discard features that are irrelevant for the prediction task. We evaluate the approaches on historical data of the component stocks of the S&P 500 index, aiming to improve the prediction performance not only at the individual stock level but also at the level of the overall stock set. Our analysis shows that trading strategies that include such feature selection methods improve portfolio performance by providing predictive signals whose information content is sufficient and less noisy than that embedded in the whole feature set. By performing an in-depth risk-return analysis, we show that the proposed trading strategies powered by explainable AI outperform highly competitive trading strategies considered as baselines.
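
    The sketch below shows one generic way to discard irrelevant features before the return-prediction step: rank them by permutation importance and keep only those above a cutoff. The synthetic feature matrix, the random-forest regressor, and the threshold are illustrative assumptions; the paper's three explainability-based selection methods are not reproduced here.

```python
# A minimal importance-based feature selection sketch on synthetic return data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(42)
n_days, n_features = 500, 20
X = rng.normal(size=(n_days, n_features))           # stand-in for technical indicators
y = 0.8 * X[:, 0] - 0.5 * X[:, 3] + 0.1 * rng.normal(size=n_days)   # next-day return proxy

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)

keep = np.where(imp.importances_mean > 0.01)[0]     # drop features with negligible importance
print("selected feature indices:", keep)
X_selected = X[:, keep]                             # feed this reduced set to the trading model
```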

    CulturAI: Semantic Enrichment of Cultural Data Leveraging Artificial Intelligence

    In this paper, we propose an innovative tool able to enrich cultural and creative spots (hereinafter, gems) extracted from the European Commission's Cultural Gems portal, by suggesting relevant keywords (tags) and YouTube videos (represented with proper thumbnails). On the one hand, the system queries the YouTube search portal, selects the videos most related to a given gem, and extracts a set of meaningful thumbnails for each video. On the other hand, each tag is selected by identifying semantically related popular search queries (i.e., trends); in particular, trends are retrieved by querying the Google Trends platform. A further novelty is that our system suggests content dynamically: since the results of a given query on both the YouTube and Google Trends platforms include the most popular videos/trends, a gem can be constantly updated with trendy content by periodically running the tool. The system has been tested on a set of gems and evaluated with the support of human annotators. The results highlight the effectiveness of our proposal.
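
    A rough sketch of the tag-suggestion side of such a pipeline, using the unofficial pytrends client for Google Trends, is given below. The gem name is a made-up example, and the real system's candidate ranking and YouTube thumbnail extraction are not shown; the exact APIs used by the paper may differ.

```python
# Fetching related Google Trends queries as candidate tags for a cultural gem
# (requires the third-party "pytrends" package; behaviour may vary across versions).
from pytrends.request import TrendReq

gem_name = "Colosseum"                      # hypothetical cultural gem
pytrends = TrendReq(hl="en-US", tz=0)
pytrends.build_payload([gem_name], timeframe="today 3-m")

related = pytrends.related_queries()        # dict: keyword -> {"top": df, "rising": df}
top = related[gem_name]["top"]
if top is not None:
    print(top["query"].head(10).tolist())   # candidate tags for the gem
```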

    Explainable Machine Learning Exploiting News and Domain-Specific Lexicon for Stock Market Forecasting

    In this manuscript, we propose a Machine Learning approach to tackle a binary classification problem whose goal is to predict the magnitude (high or low) of future stock price variations for individual companies of the S&P 500 index. Sets of lexicons are generated from globally published articles with the goal of identifying the words with the greatest impact on the market in a specific time interval and within a certain business sector. A feature engineering process is then performed on the generated lexicons, and the obtained features are fed to a Decision Tree classifier. The predicted label (high or low) indicates whether the underlying company's stock price variation on the next day is higher or lower than a certain threshold. The performance evaluation, carried out through a walk-forward strategy and against a set of solid baselines, shows that our approach clearly outperforms the competitors. Moreover, the devised Artificial Intelligence (AI) approach is explainable, in the sense that we analyze the white-box model behind the classifier and provide a set of explanations for the obtained results.
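
    To make the lexicon-to-classifier step concrete, the sketch below counts occurrences of a tiny, made-up lexicon in a day's headlines and feeds the counts to a Decision Tree. The lexicon words, headlines, labels, and thresholding are illustrative placeholders, not the paper's generated lexicons or feature engineering.

```python
# Lexicon-count features for a white-box Decision Tree classifier (toy example).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

lexicon = ["lawsuit", "upgrade", "bankruptcy", "record", "recall"]   # hypothetical impactful words

def lexicon_features(headlines: list[str]) -> np.ndarray:
    """For each lexicon word, count the number of headlines mentioning it."""
    return np.array([[sum(w in h.lower() for h in headlines) for w in lexicon]])

# One training row per (company, day): lexicon counts -> next-day move above/below threshold.
X_train = np.array([[0, 2, 0, 1, 0],
                    [1, 0, 1, 0, 2],
                    [0, 1, 0, 2, 0],
                    [2, 0, 1, 0, 1]])
y_train = [1, 0, 1, 0]                       # 1 = "high" variation, 0 = "low"

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
today = lexicon_features(["Analysts issue upgrade after record quarterly results"])
print(clf.predict(today))                    # predicted magnitude class for the next day
# The fitted tree is a white-box model: its split rules can be inspected directly,
# e.g. with sklearn.tree.export_text(clf, feature_names=lexicon).
```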

    Ensembling and Dynamic Asset Selection for Risk-Controlled Statistical Arbitrage

    In recent years, machine learning algorithms have been successfully employed to identify hidden patterns of financial market behavior and, consequently, have become a land of opportunity for financial applications such as algorithmic trading. In this paper, we propose a statistical arbitrage trading strategy with two key elements: an ensemble of regression algorithms for asset return prediction, followed by a dynamic asset selection step. More specifically, we construct an extremely heterogeneous ensemble, ensuring model diversity by using state-of-the-art machine learning algorithms, data diversity by using a feature selection process, and method diversity by using both individual models for each asset and models that learn cross-sectionally across multiple assets. Their predictive results are then fed into a quality assurance mechanism that prunes assets with poor forecasting performance in the previous periods. We evaluate the approach on historical data of the component stocks of the S&P 500 index. By performing an in-depth risk-return analysis, we show that this setup outperforms highly competitive trading strategies considered as baselines. Experimentally, we show that the dynamic asset selection enhances overall trading performance, both in terms of return and of risk. Moreover, the proposed approach yields superior results during both financial turmoil and massive market growth periods, and it is generally applicable to any risk-balanced trading strategy aiming to exploit different asset classes.
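
    As a rough sketch of the dynamic asset selection step, the code below drops assets whose recent forecasts were poor before the next portfolio is formed. The error metric (mean absolute error over a trailing window), the cutoff, the tickers, and the synthetic forecasts are assumptions for illustration, not the paper's exact quality assurance rule.

```python
# Prune assets with poor trailing forecast accuracy (toy, synthetic data).
import numpy as np

def select_assets(pred_history: dict, realized_history: dict,
                  window: int = 20, max_mae: float = 0.01) -> list:
    """Keep assets whose trailing mean absolute forecast error is below max_mae."""
    selected = []
    for ticker, preds in pred_history.items():
        errors = np.abs(np.array(preds[-window:]) - np.array(realized_history[ticker][-window:]))
        if errors.mean() <= max_mae:
            selected.append(ticker)
    return selected

rng = np.random.default_rng(1)
realized = {t: rng.normal(0, 0.01, 60).tolist() for t in ["AAPL", "MSFT", "XOM"]}
preds = {"AAPL": (np.array(realized["AAPL"]) + rng.normal(0, 0.002, 60)).tolist(),  # accurate forecasts
         "MSFT": (np.array(realized["MSFT"]) + rng.normal(0, 0.002, 60)).tolist(),
         "XOM":  rng.normal(0, 0.05, 60).tolist()}                                   # noisy forecasts
print(select_assets(preds, realized))   # likely keeps AAPL and MSFT, prunes XOM
```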

    Survey on Videos Data Augmentation for Deep Learning Models

    In most Computer Vision applications, Deep Learning models achieve state-of-the-art performance. One drawback of Deep Learning is the large amount of data needed to train the models. Unfortunately, in many applications, data are difficult or expensive to collect. Data augmentation can alleviate the problem by generating new data from a smaller initial dataset. Geometric and color-space image augmentation methods can increase the accuracy of Deep Learning models, but they are often not enough. More advanced solutions are Domain Randomization methods or the use of simulation to artificially generate the missing data. Data augmentation algorithms are usually designed specifically for single images. More recently, Deep Learning models have also been applied to the analysis of video sequences. The aim of this paper is to perform an exhaustive study of the novel techniques of video data augmentation for Deep Learning models and to point out the future directions of research on this topic.
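
    One basic concern that distinguishes video from single-image augmentation is temporal consistency: the same transformation should usually be applied to every frame of a clip. The sketch below illustrates this idea with a random crop, flip, and brightness change sampled once per clip; the clip shape and the specific transforms are assumptions, not a method from the surveyed literature.

```python
# Clip-level augmentation: sample the transform once, apply it to all frames.
import numpy as np

def augment_clip(clip: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """clip: (num_frames, height, width, channels) with values in [0, 1]."""
    flip = rng.random() < 0.5                    # sample the transform once per clip...
    brightness = rng.uniform(0.8, 1.2)
    crop_y = rng.integers(0, clip.shape[1] // 8 + 1)
    crop_x = rng.integers(0, clip.shape[2] // 8 + 1)
    out = clip[:, crop_y:, crop_x:, :]           # ...and apply it to every frame identically
    if flip:
        out = out[:, :, ::-1, :]                 # horizontal flip
    return np.clip(out * brightness, 0.0, 1.0)

rng = np.random.default_rng(0)
clip = rng.random((16, 112, 112, 3))             # 16 random frames standing in for a video
print(augment_clip(clip, rng).shape)
```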

    Deep learning and sentiment analysis for human-robot interaction

    In this paper we present ongoing work showing to what extent semantic technologies, deep learning, and natural language processing can be applied within the field of Human-Robot Interaction. The project has been developed for Zora, a completely programmable and autonomous humanoid robot, and it aims at allowing Zora to interact with humans using natural language. The robot is capable of talking to the user and understanding sentiments by leveraging our external services, such as a Sentiment Analysis engine and a Generative Conversational Agent, which is responsible for generating Zora's answers to open-dialog natural language utterances.
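
    As a generic illustration of the kind of sentiment scoring such an external engine could perform on a user utterance before the robot chooses its reply, the sketch below uses NLTK's VADER analyzer. VADER and the compound-score thresholds are stand-ins, not the engine actually used in the project.

```python
# Scoring the sentiment of a user utterance with NLTK's VADER (illustrative stand-in).
import nltk
nltk.download("vader_lexicon", quiet=True)
from nltk.sentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
utterance = "I had a really rough day, nothing went right."
scores = analyzer.polarity_scores(utterance)     # {'neg': ..., 'neu': ..., 'pos': ..., 'compound': ...}
mood = ("negative" if scores["compound"] < -0.05
        else "positive" if scores["compound"] > 0.05 else "neutral")
print(mood, scores)                              # the robot could adapt its answer to this mood
```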