648 research outputs found

    Enhancing temporal series of Sentinel-2 and Sentinel-3 data products: from classical regression to deep learning approach

    Get PDF
    Treball de Final de Màster Universitari Erasmus Mundus en Tecnologia Geoespacial (Pla de 2013). Codi: SIW013. Curs acadèmic 2020-2021The free and open availability of satellite images covering global extent in recent days provides many novel opportunities for global monitoring of the earth’s surface. Sentinel-2 (S2) and Sentinel-3 (S3) satellite missions capture mid to high resolution imagery with frequent revisit and show data synergy as they both focus on land and ocean observational needs. Specifically, the high temporal resolution of S3 (1-2 day revisit) presents potential in filling the data gaps in S2 (5 day revisit) vegetation products. In this scenario, this study assesses the feasibility of using Sentinel-3 images for Sentinel-2 vegetation products estimation using machine learning (ML) and deep learning (DL) approaches. This study employs four state of the art ML regression algorithms, linear regression, ridge regression, Support Vector Regression (SVR) and Random Forest Regression (RFR) and two DL network architectures with different depth and complexities, Multi-Layer Perceptron (MLP) and Convolutional Neural Network (CNN) to predict the S2 NDVI and SAVI maps from the S3 spectral bands information. A paired S2/S3 dataset is prepared for the study area covering one S2 tile in Extremadura, Spain. The results demonstrate that all the DL architectures except pixel-wise MLP outperformed the ML models with the 3D CNN performing the best. The best performing 3D CNN architecture obtained remarkable mean squared error (MSE) of 0.00198 for NDVI and 0.00282 for SAVI while the best performing ML algorithms were patch-wise RFR with MSE of 0.0035 in case of NDVI and patchwise SVR with MSE of 0.00586 for SAVI. The models and the dataset prepared for this study will be useful for further research that focus on capitalizing the free and open availability of Sentinel-2 and Sentinel-3 imagery as well as new and advanced technologies to provide better vegetation monitoring capabilities for our planet

    Implicit Sensor-based Authentication of Smartphone Users with Smartwatch

    Full text link
    Smartphones are now frequently used by end-users as the portals to cloud-based services, and smartphones are easily stolen or co-opted by an attacker. Beyond the initial log-in mechanism, it is highly desirable to re-authenticate end-users who are continuing to access security-critical services and data, whether in the cloud or in the smartphone. But attackers who have gained access to a logged-in smartphone have no incentive to re-authenticate, so this must be done in an automatic, non-bypassable way. Hence, this paper proposes a novel authentication system, iAuth, for implicit, continuous authentication of the end-user based on his or her behavioral characteristics, by leveraging the sensors already ubiquitously built into smartphones. We design a system that gives accurate authentication using machine learning and sensor data from multiple mobile devices. Our system can achieve 92.1% authentication accuracy with negligible system overhead and less than 2% battery consumption.Comment: Published in Hardware and Architectural Support for Security and Privacy (HASP), 201

    Data Reduction and Deep-Learning Based Recovery for Geospatial Visualization and Satellite Imagery

    Get PDF
    The storage, retrieval and distribution of data are some critical aspects of big data management. Data scientists and decision-makers often need to share large datasets and make decisions on archiving or deleting historical data to cope with resource constraints. As a consequence, there is an urgency of reducing the storage and transmission requirement. A potential approach to mitigate such problems is to reduce big datasets into smaller ones, which will not only lower storage requirements but also allow light load transfer over the network. The high dimensional data often exhibit high repetitiveness and paradigm across different dimensions. Carefully prepared data by removing redundancies, along with a machine learning model capable of reconstructing the whole dataset from its reduced version, can improve the storage scalability, data transfer, and speed up the overall data management pipeline. In this thesis, we explore some data reduction strategies for big datasets, while ensuring that the data can be transferred and used ubiquitously by all stakeholders, i.e., the entire dataset can be reconstructed with high quality whenever necessary. One of our data reduction strategies follows a straightforward uniform pattern, which guarantees a minimum of 75% data size reduction. We also propose a novel variance based reduction technique, which focuses on removing only redundant data and offers additional 1% to 2% deletion rate. We have adopted various traditional machine learning and deep learning approaches for high-quality reconstruction. We evaluated our pipelines with big geospatial data and satellite imageries. Among them, our deep learning approaches have performed very well both quantitatively and qualitatively with the capability of reconstructing high quality features. We also show how to leverage temporal data for better reconstruction. For uniform deletion, the reconstruction accuracy observed is as high as 98.75% on an average for spatial meteorological data (e.g., soil moisture and albedo), and 99.09% for satellite imagery. Pushing the deletion rate further by following variance based deletion method, the decrease in accuracy remains within 1% for spatial meteorological data and 7% for satellite imagery

    Convolutional Neural Networks - Generalizability and Interpretations

    Get PDF

    Tensor Regression

    Full text link
    Regression analysis is a key area of interest in the field of data analysis and machine learning which is devoted to exploring the dependencies between variables, often using vectors. The emergence of high dimensional data in technologies such as neuroimaging, computer vision, climatology and social networks, has brought challenges to traditional data representation methods. Tensors, as high dimensional extensions of vectors, are considered as natural representations of high dimensional data. In this book, the authors provide a systematic study and analysis of tensor-based regression models and their applications in recent years. It groups and illustrates the existing tensor-based regression methods and covers the basics, core ideas, and theoretical characteristics of most tensor-based regression methods. In addition, readers can learn how to use existing tensor-based regression methods to solve specific regression tasks with multiway data, what datasets can be selected, and what software packages are available to start related work as soon as possible. Tensor Regression is the first thorough overview of the fundamentals, motivations, popular algorithms, strategies for efficient implementation, related applications, available datasets, and software resources for tensor-based regression analysis. It is essential reading for all students, researchers and practitioners of working on high dimensional data.Comment: 187 pages, 32 figures, 10 table

    Data center's telemetry reduction and prediction through modeling techniques

    Get PDF
    Nowadays, Cloud Computing is widely used to host and deliver services over the Internet. The architecture of clouds is complex due to its heterogeneous nature of hardware and is hosted in large scale data centers. To effectively and efficiently manage such complex infrastructure, constant monitoring is needed. This monitoring generates large amounts of telemetry data streams (e.g. hardware utilization metrics) which are used for multiple purposes including problem detection, resource management, workload characterization, resource utilization prediction, capacity planning, and job scheduling. These telemetry streams require costly bandwidth utilization and storage space particularly at medium-long term for large data centers. Moreover, accurate future estimation of these telemetry streams is a challenging task due to multi-tenant co-hosted applications and dynamic workloads. The inaccurate estimation leads to either under or over-provisioning of data center resources. In this Ph.D. thesis, we propose to improve the prediction accuracy and reduce the bandwidth utilization and storage space requirement with the help of modeling and prediction methods from machine learning. Most of the existing methods are based on a single model which often does not appropriately estimate different workload scenarios. Moreover, these prediction methods use a fixed size of observation windows which cannot produce accurate results because these are not adaptively adjusted to capture the local trends in the recent data. Therefore, the estimation method trains on fixed sliding windows use an irrelevant large number of observations which yields inaccurate estimations. In summary, we C1) efficiently reduce bandwidth and storage for telemetry data through real-time modeling using Markov chain model. C2) propose a novel method to adaptively and automatically identify the most appropriate model to accurately estimate data center resources utilization. C3) propose a deep learning-based adaptive window size selection method which dynamically limits the sliding window size to capture the local trend in the latest resource utilization for building estimation model.Hoy en día, Cloud Computing se usa ampliamente para alojar y prestar servicios a través de Internet. La arquitectura de las nubes es compleja debido a su naturaleza heterogénea del hardware y está alojada en centros de datos a gran escala. Para administrar de manera efectiva y eficiente dicha infraestructura compleja, se necesita un monitoreo constante. Este monitoreo genera grandes cantidades de flujos de datos de telemetría (por ejemplo, métricas de utilización de hardware) que se utilizan para múltiples propósitos, incluyendo detección de problemas, gestión de recursos, caracterización de carga de trabajo, predicción de utilización de recursos, planificación de capacidad y programación de trabajos. Estas transmisiones de telemetría requieren una utilización costosa del ancho de banda y espacio de almacenamiento, particularmente a mediano y largo plazo para grandes centros de datos. Además, la estimación futura precisa de estas transmisiones de telemetría es una tarea difícil debido a las aplicaciones cohospedadas de múltiples inquilinos y las cargas de trabajo dinámicas. La estimación inexacta conduce a un suministro insuficiente o excesivo de los recursos del centro de datos. En este Ph.D. En la tesis, proponemos mejorar la precisión de la predicción y reducir la utilización del ancho de banda y los requisitos de espacio de almacenamiento con la ayuda de métodos de modelado y predicción del aprendizaje automático. La mayoría de los métodos existentes se basan en un modelo único que a menudo no estima adecuadamente diferentes escenarios de carga de trabajo. Además, estos métodos de predicción utilizan un tamaño fijo de ventanas de observación que no pueden producir resultados precisos porque no se ajustan adaptativamente para capturar las tendencias locales en los datos recientes. Por lo tanto, el método de estimación entrena en ventanas corredizas fijas utiliza un gran número de observaciones irrelevantes que produce estimaciones inexactas. En resumen, C1) reducimos eficientemente el ancho de banda y el almacenamiento de datos de telemetría a través del modelado en tiempo real utilizando el modelo de cadena de Markov. C2) proponer un método novedoso para identificar de forma adaptativa y automática el modelo más apropiado para estimar con precisión la utilización de los recursos del centro de datos. C3) proponer un método de selección de tamaño de ventana adaptativo basado en el aprendizaje profundo que limita dinámicamente el tamaño de ventana deslizante para capturar la tendencia local en la última utilización de recursos para el modelo de estimación de construcción.Postprint (published version
    corecore