3 research outputs found

    Grand Challenge: Real-time Destination and ETA Prediction for Maritime Traffic

    Full text link
    In this paper, we present our approach to solving the DEBS Grand Challenge 2018. The challenge asks for predictions of (i) the destination and (ii) the arrival time of ships, in a streaming fashion, using geospatial data in the maritime context. Novel aspects of our approach include the use of ensemble learning based on Random Forest, Gradient Boosting Decision Trees (GBDT), XGBoost Trees, and Extremely Randomized Trees (ERT) to predict the destination, while for the arrival time we propose the use of feed-forward neural networks. In our evaluation, we were able to achieve an accuracy of 97% for the port destination classification problem and 90% (in mins) for the ETA prediction.

    Active replication for latency-sensitive stream processing in Apache Flink

    No full text
    Stream processing frameworks allow processing massive amounts of data shortly after it is produced, and enable a fast reaction to events in scenarios such as data center monitoring, smart transportation, or telecommunication networks. Many scenarios depend on the fast and reliable processing of incoming data, requiring low end-to-end latencies from the ingestion of a new event to the corresponding output. The occurrence of faults jeopardizes these guarantees: currently leading high-availability solutions for stream processing, such as those in Spark Streaming or Apache Flink, implement passive replication through snapshotting, requiring a stop-the-world operation to recover from a failure. Active replication, while incurring higher deployment costs, can overcome these limitations, masking the impact of faults and meeting stringent end-to-end latency requirements. We present the design, implementation, and evaluation of active replication in the popular Apache Flink platform. Our study explores two alternative designs: a leader-based approach leveraging external services (Kafka and ZooKeeper) and a leaderless implementation leveraging a novel deterministic merging algorithm. Our evaluation, using a series of microbenchmarks and a SaaS cloud monitoring scenario on a 37-server cluster, shows that the actively replicated Flink can fully mask the impact of faults on end-to-end latency.

    The DEBS 2019 Grand Challenge

    No full text
    The ACM DEBS 2019 Grand Challenge is the ninth in a series of challenges which seek to provide a common ground and evaluation criteria for a competition aimed at both research and industrial event-based systems. The focus of the 2019 Grand Challenge is on the application of machine learning to LiDAR data. The goal of the challenge is to classify objects found in urban environments and sensed in several 3D scenes by the LiDAR. The applications of LiDAR and object detection go well beyond autonomous vehicles and are suitable for use in agriculture, waterway maintenance and flood prevention, and construction. This paper describes the specifics of the data streams provided in the challenge as well as the benchmarking platform that supports the testing of corresponding solutions.