38 research outputs found

    Integrating Data Science and Earth Science

    This open access book presents the results of three years' collaboration between earth scientists and data scientists in developing and applying data science methods for scientific discovery. The book will be highly beneficial for other researchers at senior and graduate level who are interested in applying visual data exploration, computational approaches and scientific workflows

    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction rises to allow a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications

    Tight Arms Race: Overview of Current Malware Threats and Trends in Their Detection

    Cyber attacks are currently booming, as attackers reap significant profits from them and face limited risk compared to committing "classical" crimes. One of the major components that leads to the successful compromise of a targeted system is malicious software. It allows the victim's machine to be used for various nefarious purposes, e.g., making it part of a botnet, mining cryptocurrencies, or holding the data stored there hostage. At present, the complexity, proliferation, and variety of malware pose a real challenge for existing countermeasures and require their constant improvement. That is why, in this paper, we first perform a detailed meta-review of the existing surveys related to malware and its detection techniques, showing an arms race between these two sides of the barricade. On this basis, we review the evolution of modern threats in communication networks, with a particular focus on techniques employing information hiding. Next, we present a bird's-eye view portraying the main development trends in detection methods, with a special emphasis on machine learning techniques. The survey concludes with a description of potential future research directions in the field of malware detection

    Technologies for a FAIRer use of Ocean Best Practices

    The publication and dissemination of best practices in ocean observing is pivotal for multiple aspects of modern marine science, including cross-disciplinary interoperability, improved reproducibility of observations and analyses, and training of new practitioners. Often, best practices are not published in a scientific journal and may not even be formally documented, residing solely within the minds of individuals who pass the information along through direct instruction. Naturally, documenting best practices is essential to accelerate high-quality marine science; however, documentation in a drawer has little impact. To enhance the application and development of best practices, we must leverage contemporary document handling technologies to make best practices discoverable, accessible, and interlinked, echoing the logic of the FAIR data principles [1]

    Proceedings of the 10th International Conference on Ecological Informatics: translating ecological data into knowledge and decisions in a rapidly changing world: ICEI 2018

    The Conference Proceedings are an impressive display of the current scope of Ecological Informatics. Whilst Data Management, Analysis, Synthesis and Forecasting have been lastingly popular themes over the past nine biannual ICEI conferences, ICEI 2018 addresses distinctively novel developments in Data Acquisition enabled by cutting-edge in situ and remote sensing technology. The ICEI 2018 abstracts presented here capture well the current trends and challenges of Ecological Informatics towards: • regional, continental and global sharing of ecological data, • thorough integration of complementary monitoring technologies including DNA-barcoding, • sophisticated pattern recognition by deep learning, • advanced exploration of valuable information in ‘big data’ by means of machine learning and process modelling, • decision-informing solutions for biodiversity conservation and sustainable ecosystem management in light of global changes

    Adaptive Asynchronous Control and Consistency in Distributed Data Exploration Systems

    Advances in machine learning and streaming systems provide a backbone to transform vast arrays of raw data into valuable information. Leveraging distributed execution, analysis engines can process this information effectively within an iterative data exploration workflow to solve problems at unprecedented rates. However, with increased input dimensionality, a desire to simultaneously share and isolate information, as well as overlapping and dependent tasks, this process is becoming increasingly difficult to maintain. User interaction derails exploratory progress due to manual oversight of lower-level tasks such as tuning parameters, adjusting filters, and monitoring queries. We identify human-in-the-loop management of data generation and distributed analysis as an inhibiting problem that precludes efficient online, iterative data exploration and causes delays in knowledge discovery and decision making. The flexible and scalable systems implementing the exploration workflow require semi-autonomous methods integrated as architectural support to reduce human involvement. We thus argue that an abstraction layer providing adaptive asynchronous control and consistency management over a series of individual tasks coordinated to achieve a global objective can significantly improve data exploration effectiveness and efficiency. This thesis introduces methodologies which autonomously coordinate distributed execution at a lower level in order to synchronize multiple efforts as part of a common goal. We demonstrate the impact on data exploration through serverless simulation ensemble management and multi-model machine learning by showing improved performance and reduced resource utilization, enabling a more productive semi-autonomous exploration workflow. We focus on the specific domains of molecular dynamics and personalized healthcare; however, the contributions are applicable to a wide variety of domains