7 research outputs found

    Forecasting Time Series of Cryptocurrency Market Rates (Прогнозування часових рядів ринкового курсу криптовалют)

    Qualification thesis: 89 pages, 6 tables, 20 figures, 2 appendices, 20 references. Object of research: time series forecasting with LSTM and GRU recurrent neural networks. Aim of the work: to deepen knowledge of time series forecasting with LSTM and GRU recurrent neural networks, to develop neural network architectures with LSTM and GRU layers, to interpret the network parameters, and to obtain meaningful results on the practical task of forecasting the financial return on an asset. Research methods: concepts and methods of financial technical analysis, optimization methods, and machine learning and neural network algorithms and approaches. Conclusions and novelty: neural networks are a powerful modern approach to time series forecasting. The developed architectures offer both low-risk and high-risk entry points into a financial trade. The results can be applied by experts in the cryptocurrency investment industry who use the models as trading signals. Depending on the type of trading, the model with LSTM layers can be used for low-risk trading and the model with GRU layers for high-risk trading. An interpretation of the neural network parameters is given, which addresses the "black box" problem.
    Thesis: 89 pages, 26 figures, 6 tables, 2 appendices, 20 references. The graduation research of fourth-year student D. Radchenko (National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Educational and Scientific Institute of Applied Systems Analysis) deals with predicting crypto asset returns using recurrent neural networks (RNNs). The main goal is to gain deeper knowledge of time series prediction with RNNs and to develop network architectures with LSTM and GRU layers. A further aim is to make the developed networks interpretable, which is addressed using Shapley values. Two models were developed during the research. The first has LSTM layers and was intended to make low-risk predictions during the 2022 fall of the crypto market; the second has GRU layers and makes high-risk predictions. The training dataset includes not only prices of the target asset but also prices of the largest crypto assets, BTC and ETH. Both models were trained with Dropout and Batch Normalization regularization layers and modified ReLU activation functions such as Leaky ReLU and ELU. Shapley values were used to make the impact of each model feature clear. The research results can be applied as trading signals by market experts. Depending on the type of trading, one can choose either the high-risk or the low-risk model. Moreover, a deeper understanding of the models can be gained by examining feature importance.

    Evaluation of Three Deep Learning Models for Early Crop Classification Using Sentinel-1A Imagery Time Series—A Case Study in Zhanjiang, China

    Timely and accurate estimation of the area and distribution of crops is vital for food security. Optical remote sensing has been a key technique for acquiring crop area and conditions on regional to global scales, but great challenges arise due to frequent cloudy days in southern China. This makes optical remote sensing images usually unavailable. Synthetic aperture radar (SAR) could bridge this gap since it is less affected by clouds. The recent availability of Sentinel-1A (S1A) SAR imagery with a 12-day revisit period at a high spatial resolution of about 10 m makes it possible to fully utilize phenological information to improve early crop classification. In deep learning methods, one-dimensional convolutional neural networks (1D CNNs), long short-term memory recurrent neural networks (LSTM RNNs), and gated recurrent unit RNNs (GRU RNNs) have been shown to efficiently extract temporal features for classification tasks. However, due to the complexity of training, these three deep learning methods have been less used in early crop classification. In this work, we attempted to combine them with an incremental classification method to avoid the need for training optimal architectures and hyper-parameters for data from each time series. First, we trained 1D CNNs, LSTM RNNs, and GRU RNNs based on the full images’ time series to attain three classifiers with optimal architectures and hyper-parameters. Then, starting at the first time point, we performed an incremental classification process to train each classifier using all of the previous data, and obtained a classification network with all parameter values (including the hyper-parameters) at each time point. Finally, test accuracies of each time point were assessed for each crop type to determine the optimal time series length. A case study was conducted in Suixi and Leizhou counties of Zhanjiang City, China. To verify the effectiveness of this method, we also implemented the classic random forest (RF) approach. 
The results were as follows: (i) 1D CNNs achieved the highest Kappa coefficient (0.942) of the four classifiers, and the highest value in the GRU RNNs time series (0.934) was attained earlier than with the other classifiers; (ii) all three deep learning methods and the RF achieved F-measures above 0.900 before the end of the growth seasons of banana, eucalyptus, second-season paddy rice, and sugarcane, while the 1D CNN classifier was the only one that obtained an F-measure above 0.900 for pineapple before harvest. All results indicated the effectiveness of combining the deep learning models with the incremental classification approach for early crop classification. This method is expected to provide new perspectives for early mapping of croplands in cloudy areas.
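    The Kappa coefficient and per-class F-measure quoted in this abstract can both be derived from a confusion matrix. The following is a minimal sketch of those two metrics, not the authors' code; the function names and the toy two-class matrix are illustrative assumptions.

```python
def cohen_kappa(cm):
    """Cohen's kappa from a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(k)) / n
    row_tot = [sum(cm[i]) for i in range(k)]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


def f_measure(cm, c):
    """F1 score (harmonic mean of precision and recall) for class index c."""
    tp = cm[c][c]
    fn = sum(cm[c]) - tp                   # reference c, predicted otherwise
    fp = sum(row[c] for row in cm) - tp    # predicted c, reference otherwise
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example (hypothetical counts, not data from the study).
toy_cm = [[45, 5],
          [5, 45]]
toy_kappa = cohen_kappa(toy_cm)
toy_f0 = f_measure(toy_cm, 0)
```

    Kappa corrects overall accuracy for chance agreement, which is why it is often reported alongside per-class F-measures when class sizes differ.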

    Convolutional Neural Networks and their Application in Cancer Diagnosis based on RNA-Sequencing

    Gene expression analysis is the study of how genes are transcribed to synthesize functional gene products: functional RNA species or protein products. It provides insight into cellular processes such as cellular differentiation and abnormal pathological processes. Cancer is a genetic disease in which genetic variations cause genes to function abnormally and alter their expression. Proteins, being the final products of gene expression, define phenotypes and biological processes. Therefore, detecting gene expression levels can be used for cancer diagnosis, prognosis, and even treatment selection. This thesis will analyze the theory and applications of Deep Learning (DL). It will then apply DL, and in particular a Convolutional Neural Network (CNN), to the diagnosis of multiple cancer types (pan-cancer classification) using gene expression data, specifically RNA-sequencing data. The Cancer Genome Atlas (TCGA) data, which consist of RNA-sequencing data, will be preprocessed and then embedded into multiple two-dimensional (2D) images. These images will then be fed to a CNN, which will classify them into 33 types of cancer, aiming for the highest possible diagnostic accuracy.

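    Flood Analysis and Vegetation Cover Classification in the Amazon Biome Using Sentinel-1 SAR Time Series and Deep Learning Techniques (Análise de inundações e classificação da cobertura vegetal no bioma amazônico usando séries temporais sentinel-1 SAR e técnicas de deep learning)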

    Thesis (doctorate), Universidade de Brasília, Instituto de Ciências Humanas, Departamento de Geografia, Programa de Pós-Graduação em Geografia, 2022. Water resources and forest phenological studies are extremely important for understanding various natural phenomena, such as climate change, hydrogeomorphological dynamics, environmental conditioning, and resource management. Within this water dynamic, floodable areas are intrinsically linked to the maintenance of the biota and fauna of the Brazilian biomes. In this context, products derived from remote sensing have been widely used for the analysis and monitoring of flooded areas, land use and occupation mapping, and phenological dynamics. Synthetic aperture radar (SAR) images are promising products because they are free of atmospheric interference; however, they require several initial treatments, known as pre-processing, to allow better extraction of information from a given area. This research therefore aimed to apply deep learning techniques, using neural-network-based algorithms for processing satellite image time series, to extract and identify flooded areas, water bodies, and forest phenologies in areas of cerrado, Amazon forest, mangroves, agricultural crops, and floodplain (várzea). The study was divided into three main chapters: (a) metric and statistical analysis of spatial filtering in a Sentinel-1 SAR image of Central Amazonia, Brazil; (b) Sentinel-1 SAR time series analysis of flooding in Central Amazonia; and (c) phenological classification of forest, mangroves, cerrado (savanna), and flooded vegetation of the Amazon biome by comparing LSTM, Bi-LSTM, GRU, Bi-GRU, and machine learning models based on Sentinel-1 time series.
    The methodological steps differed between chapters, and all yielded precise results and high metric values for the measurement and analysis of water bodies, flooding, and forest phenologies. Among the filtering methods analyzed on the SAR image, the Lee filter with a 3 × 3 window performed best at reducing speckle noise (MSE of 1.88 and MAE of 1.638), with low contrast distortion, in the VH polarization. For the VV polarization, however, the results differed: the Frost filter with a 3 × 3 window performed best, with low values for the metrics in general (MSE of 1.2 and MAE of 6.28) and also low contrast distortion. Because it presented the best statistical values, the median filter with an 11 × 11 window can be used as an alternative filtering technique for Sentinel-1 images in both the VH and VV polarizations. The flooded areas measured in the VH and VV polarizations showed a strong correlation and no statistically significant difference between the samples, suggesting that both polarizations can be used to obtain the flood pulse and map the dynamics of floodable areas in the region. Because there are no Sentinel-1 images prior to 2016, when extreme LMEO events exceeded 100%, it was not possible to delimit the LMEO from SAR data. Some areas along the coast and rivers show temporal backscatter signatures that reveal transitions between terrestrial environments and water-covered areas. A temporal change in backscatter from higher to lower values indicates erosion and progressive flooding, while the inverse indicates land gain. The Bi-GRU model achieved the highest overall accuracy, precision, recall, and F-score both for individual polarizations and for the combined VV+VH polarization.
    Combining the polarizations gave the best classification results, and the VH polarization outperformed the VV polarization. The study confirmed that the methodological procedure is adequate for measuring the areas of water bodies and their flood pulse, and it obtained high-precision phenology classification in Central Amazonia by applying deep learning to time series of Sentinel-1 SAR images.
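    The filter comparison in this abstract relies on MSE and MAE. The following is a minimal sketch of how such error metrics can be computed for a filtered image, using a simple median filter as the example. It is not the thesis code: the abstract does not say which image the errors were measured against, so comparing with the unfiltered input is an assumption here, as are the function names, the edge-replication border handling, and the toy image.

```python
from statistics import median


def median_filter(img, size=3):
    """Apply a size x size median filter; borders use edge replication (an assumption)."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbour coordinates so the window stays inside the image.
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out


def mse(a, b):
    """Mean squared error between two equally sized 2D images."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def mae(a, b):
    """Mean absolute error between two equally sized 2D images."""
    diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# Toy image (hypothetical values): a flat patch with one speckle-like outlier.
toy_img = [[10, 10, 10],
           [10, 100, 10],
           [10, 10, 10]]
toy_filtered = median_filter(toy_img, size=3)
toy_mse = mse(toy_img, toy_filtered)
toy_mae = mae(toy_img, toy_filtered)
```

    MSE penalizes large deviations more heavily than MAE, so reporting both gives a fuller picture of how strongly a filter alters the image.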

    Machine Learning Prediction of Mechanical and Durability Properties of Recycled Aggregates Concrete

    Whilst recycled aggregate (RA) can alleviate the environmental footprint of concrete production and the landfilling of colossal amounts of demolition waste, there is a need for robust predictive tools for its effects on mechanical and durability properties. In this thesis, state-of-the-art machine learning (ML) models were deployed to predict properties of recycled aggregate concrete (RAC). A systematic review was performed to analyze pertinent ML techniques previously applied in the concrete technology field. Accordingly, three different ML methods were selected to determine the compressive strength of RAC and perform mixture proportioning optimization. Furthermore, a gradient boosting regression tree was used to study the effects of RA and several types of binders on the carbonation depth of RAC. The ML models developed in this study demonstrated robust performance in predicting diverse properties of RAC.

    Large Area Land Cover Mapping Using Deep Neural Networks and Landsat Time-Series Observations

    This dissertation focuses on the analysis and implementation of deep learning methodologies in remote sensing to enhance land cover classification accuracy, which has important applications in many areas of environmental planning and natural resources management. The first manuscript conducted a land cover analysis on 26 Landsat scenes in the United States by considering six classifier variants. An extensive grid search was conducted to optimize classifier parameters using only the spectral components of each pixel. Results showed no gain from deep networks over conventional classifiers when only spectral components were used, possibly due to the small reference sample size and richness of features. The effects of changing training data size, class distribution, and scene heterogeneity were also studied, and all were found to have a significant effect on classifier accuracy. The second manuscript reviewed 103 research papers on the application of deep learning methodologies in remote sensing, with emphasis on per-pixel classification of mono-temporal data and the use of spectral and spatial data dimensions. A meta-analysis quantified deep network architecture improvement over selected convolutional classifiers. The effects of network size, learning methodology, input data dimensionality, and training data size were also studied, with deep models providing enhanced performance over conventional ones using spectral and spatial data. The analysis found that the input dataset was a major limitation and that available datasets have already been utilized to their maximum capacity. The third manuscript described the steps to build the full environment for dataset generation based on Landsat time-series data using the spectral, spatial, and temporal information available for each pixel.
A large dataset containing one sample block from each of 84 ecoregions in the conterminous United States (CONUS) was created and then processed by a hybrid convolutional+recurrent deep network, and the network structure was optimized with thousands of simulations. The developed model achieved an overall accuracy of 98% on the test dataset. The model was also evaluated for its overall and per-class performance under different conditions, including individual blocks, individual or combined Landsat sensors, and different sequence lengths. The analysis found that although the deep model outperforms the other candidates on each block, its performance still varies considerably from block to block. This suggests extending the work by fine-tuning the model for local areas. The analysis also found that including more time stamps or combining observations from different Landsat sensors in the model input significantly enhances model performance.

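    Monitoring and Classification of Biophysical Parameters of Agricultural Surfaces from Radar Remote Sensors (Seguimiento y clasificación de parámetros biofísicos de superficies agrícolas a partir de sensores remotos radar)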

    Monitoring and classifying agricultural crops is of great importance for the socio-economic management of societies and is essential for the sustainable management of agricultural activities. With this information, local, national, or international authorities, agricultural cooperatives, and farmers can access precise, up-to-date information to manage crops better, as well as information on crop growth or yield estimates. Remote sensing, being a non-destructive way of monitoring vegetation, is an ideal tool for obtaining the necessary information, and its uninterrupted temporal coverage makes it possible to follow the phenological cycles of plants. Although optical remote sensing has been used successfully for crop monitoring and classification, these systems are limited to data acquired under clear-sky conditions. In this context, data acquired by synthetic aperture radar (SAR) sensors are of great interest for agricultural applications because of the ability of these systems to monitor crops in all weather conditions and the sensitivity of the microwave signal to the dielectric and geometric properties of the target. Depending on the system configuration, SAR sensors can acquire data in different modes. Data acquisition in different modes has given rise to processing techniques such as polarimetry (PolSAR), interferometry (InSAR), and differential interferometry (DInSAR). Polarimetry was used for this thesis, since in agriculture the use of this technique rests on the well-known sensitivity of microwaves to crop structure, the dielectric properties of the canopy, and the physical properties of the underlying soil.
    The thesis had several objectives: to broaden knowledge of SAR observables (beyond backscatter coefficients) for crop monitoring; to investigate the effect of incidence angle on the relationship between polarimetric observables and different biophysical variables; and, finally, to study the feasibility of SAR observables for classifying and distinguishing agricultural crops. For the first and second objectives, a time series of 20 RADARSAT-2 images acquired at different incidence angles (25°, 31°, and 36°) during the growing season of rainfed crops was used. Ten polarimetric observables were extracted from the images, while six biophysical variables were estimated from in situ measurements. A descriptive analysis and a statistical correlation analysis between the two datasets were then carried out. The results presented in this thesis show significant correlations between several polarimetric observables (HH/VV, HV/VV, γHHVV, α1, γP1P2) and several biophysical variables, such as biomass, height, and leaf area index, for incidence angles of 31° and 36°. For the last objective, crops were classified with a machine learning algorithm, using as classifier inputs the 10 polarimetric observables from the RADARSAT-2 time series together with 3 further observables extracted from a time series of Sentinel-1 images. Because of the large amount of data, 7 different scenarios were created to evaluate the classification. Using all the RADARSAT-2 observables and images proved to have clear benefits in terms of overall classification accuracy.
    The per-cover analysis showed good separation of spring cereals, which is typically difficult because of their similar structure and phenology, whereas summer crops showed lower accuracy because of the lack of images on those dates. Although the polarimetric capabilities of RADARSAT-2 (full) and Sentinel-1 (dual) are quite different, the multitemporal approach strengthened the classification process and provided similarly satisfactory results for the different classification scenarios proposed.