161 research outputs found

    A Multi-Metric Adaptive Stream Processing System

    Stream processing systems (SPS) have to deal with highly dynamic scenarios in which adaptation is mandatory to meet realistic application requirements. In this work, we propose a new adaptive SPS for real-time processing that dynamically adapts the number of active operator replicas based on variation in the input data rate. Our SPS extends Storm by pre-allocating, for each operator, a set of inactive replicas that are activated (or deactivated) when necessary without incurring the Storm reconfiguration cost. We exploit the MAPE model and define a new metric that aggregates the values of multiple metrics to dynamically change the number of replicas of an operator. We deploy our SPS on Google Cloud Platform, and the results confirm that our metric can tolerate highly dynamic conditions, improving resource usage while preserving high throughput and low latency.

    Observation of current approaches to utilize the elastic cloud for big data stream processing

    This paper conducts a systematic literature map to collect information about current approaches to utilizing the elastic cloud for data stream processing in the big data context. It first describes and sets up the scientific methodology, which adheres to generally accepted methods for systematic literature maps. After a reference set was built and search queries were constructed for the data collection, the data set was cleaned: the publications were first filtered automatically and then reviewed manually to determine the relevant papers. The collected data was evaluated and visualized to help answer the defined research questions and present information. Finally, the results of the thesis are discussed and its limitations and implications addressed. This thesis carries out a systematic literature map to provide an overview of a field. The field examined is the use of the elastic properties of the cloud for data stream processing in the big data context. The systematic literature map comprises both the collection of all publications relevant to the examined field and the evaluation and presentation of the collected data. To evaluate the information in a targeted way, research questions were defined to serve as a guide. First, the scientific methods used, which follow recognized procedures, are presented. After a number of relevant publications had been compiled, search queries for the data collection were created on their basis. The data was then exported from the online databases of well-known publishers, and duplicates were removed. To determine the final set of relevant publications, irrelevant publications were sorted out by keyword and the remainder were assessed manually one by one. The collected data was partly evaluated automatically and classified manually in order to answer the previously defined research questions. Finally, the results are discussed and the limitations and implications of this work are addressed.

    Security Configuration Management in Intrusion Detection and Prevention Systems

    Intrusion Detection and/or Prevention Systems (IDPS) represent an important line of defense against a variety of attacks that can compromise the security and proper functioning of an enterprise information system. IDPSs can be network-based or host-based and can collaborate in order to provide better detection of malicious traffic. Although several IDPSs have been proposed, their appropriate configuration and control for effective detection/prevention of attacks and efficient resource consumption is still far from trivial. Another concern is the slowdown in system performance when maximum security is applied, hence the need to trade off security enforcement levels against the performance and usability of an enterprise information system. In this dissertation, we present a security management framework for the configuration and control of the security enforcement mechanisms of an enterprise information system. The approach leverages the dynamic adaptation of security measures based on the assessment of system vulnerability and threat prediction, and provides several levels of attack containment. Furthermore, we study the impact of security enforcement levels on the performance and usability of an enterprise information system. In particular, we analyze the impact of an IDPS configuration on the resulting security of the network and on network performance. We also analyze the performance of the IDPS for different configurations and under different traffic characteristics. The analysis can then be used to predict the impact of a given security configuration on network performance.

    Libro de Actas JCC&BD 2018 : VI Jornadas de Cloud Computing & Big Data

    This volume collects the papers presented at the VI Jornadas de Cloud Computing & Big Data (JCC&BD), held from 25 to 29 June 2018 at the Facultad de Informática of the Universidad Nacional de La Plata. Universidad Nacional de La Plata (UNLP) - Facultad de Informática

    Intelligent Computational Transportation

    Transportation is commonplace around our world, and numerous researchers dedicate great effort to a vast range of transportation research topics. The purpose of this dissertation is to investigate and address several transportation problems with respect to geographic discretization, automatic pavement surface examination, and traffic flow simulation, using advanced computational technologies. Many applications require a discretized 2D geographic map such that local information can be accessed efficiently. For example, map matching, which aligns a sequence of observed positions to a real-world road network, needs to find all the road segments near each individual position. To this end, the map is discretized into cells, and each cell retains a list of the road segments coincident with it. An efficient method is proposed to form such lists for the cells without costly overlap tests. Furthermore, the method can be easily extended to 3D scenarios for fast triangle mesh voxelization. Pavement surface distress conditions are critical inputs for quantifying roadway infrastructure serviceability. Existing computer-aided automatic examination techniques are mainly based on 2D image analysis or 3D georeferenced data sets. Information loss or extremely high cost impedes their effectiveness and applicability. In this study, a cost-effective Kinect-based approach is proposed for 3D pavement surface reconstruction and cracking recognition. Various cracking types, such as alligator cracking, transverse cracking, and longitudinal cracking, are identified and their severity assessed based on associated geometrical features. Smart transportation is one of the core components of modern urbanization processes. In this context, the Connected Autonomous Vehicle (CAV) system presents a promising solution for enhanced traffic safety and mobility through state-of-the-art wireless communications and autonomous driving techniques. Because CAVs differ in nature from conventional Human-Driven Vehicles (HDVs), it is believed that CAV-enabled transportation systems will revolutionize the existing understanding of network-wide traffic operations and re-establish traffic flow theory. This study presents a new continuum dynamics model for the future CAV-enabled traffic system, realized by encapsulating mutually coupled vehicle interactions using virtual internal and external forces. A Smoothed Particle Hydrodynamics (SPH)-based numerical simulation and an interactive traffic visualization framework are also developed.

    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction rises to give a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

    Implementazione ed ottimizzazione di algoritmi per l'analisi di Biomedical Big Data

    Big Data Analytics poses many challenges to the research community, which has to handle several computational problems related to the vast amount of data. There is increasing interest in biomedical data, with the aim of achieving so-called personalized medicine, in which therapy plans are designed around the specific genotype and phenotype of an individual patient; algorithm optimization plays a key role in this. In this work we discuss several topics related to Biomedical Big Data Analytics, with special attention to numerical issues and the algorithmic solutions related to them. We introduce a novel feature selection algorithm tailored to omics datasets and demonstrate its efficiency on synthetic and real high-throughput genomic datasets. We tested our algorithm against other state-of-the-art methods, obtaining better or comparable results. We also implemented and optimized different types of deep learning models, testing their efficiency on biomedical image processing tasks. Three novel frameworks for developing deep learning neural network models are discussed and used to describe the numerical improvements proposed on various topics. In the first implementation we optimize two Super Resolution models, showing their results on NMR images and demonstrating their efficiency in generalization tasks without retraining. The second optimization involves a state-of-the-art Object Detection neural network architecture and obtains a significant speedup in computational performance. In the third application we discuss the femur head segmentation problem on CT images using deep learning algorithms. The last section of this work covers the implementation of a novel biomedical database, obtained by harmonizing multiple data sources, that provides network-like relationships between biomedical entities. Data related to diseases and other biologically related entities were mined using web-scraping methods, and a novel natural language processing pipeline was designed to maximize the overlap between the different data sources involved in this project.

    Technologies and Applications for Big Data Value

    This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is then arranged in two parts. The first part, “Technologies and Methods”, contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part, “Processes and Applications”, details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the European data community's nucleus to bring together businesses with leading researchers to harness the value of data to benefit society, business, science, and industry. The book is of interest to two primary audiences: first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI; second, practitioners and industry experts engaged in data-driven systems, software design and deployment projects who are interested in employing these advanced methods to address real-world problems.
