
    Scalable Solutions for Automated Single Pulse Identification and Classification in Radio Astronomy

    Data collection for scientific applications is increasing exponentially and is forecast to soon reach peta- and exabyte scales. Applications that process and analyze scientific data must be scalable and focus on execution performance to keep pace. In the field of radio astronomy, in addition to increasingly large datasets, tasks such as the identification of transient radio signals from extrasolar sources are computationally expensive. We present a scalable approach to radio pulsar detection written in Scala that parallelizes candidate identification to take advantage of in-memory task processing using Apache Spark on a YARN distributed system. Furthermore, we introduce a novel automated multiclass supervised machine learning technique that we combine with feature selection to reduce the time required for candidate classification. Experimental testing on a Beowulf cluster with 15 data nodes shows that the parallel implementation of the identification algorithm offers a speedup of up to 5x over a similar multithreaded implementation. Further, we show that the combination of automated multiclass classification and feature selection speeds up the execution performance of the RandomForest machine learning algorithm by an average of 54% with less than a 2% average reduction in the algorithm's ability to correctly classify pulsars. The generalizability of these results is demonstrated using two real-world radio astronomy data sets. Comment: In Proceedings of the 47th International Conference on Parallel Processing (ICPP 2018). ACM, New York, NY, USA, Article 11, 11 pages.
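    A minimal sketch of the classification side of this approach: feature selection followed by a multiclass RandomForest, which is where the reported ~54% speedup comes from. The paper's implementation is written in Scala on Apache Spark/YARN; the scikit-learn pipeline, synthetic data, and feature counts below are illustrative assumptions, not the authors' code.

```python
# Feature selection + multiclass RandomForest, mirroring the trade-off
# described above (faster training for a small loss in accuracy).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Hypothetical stand-in for pulsar candidate features (e.g. DM, S/N, pulse width).
X, y = make_classification(n_samples=5000, n_features=30, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Keeping only the top-k features shrinks each tree's split search,
# which is the source of the training-time reduction.
clf = Pipeline([
    ("select", SelectKBest(f_classif, k=10)),
    ("forest", RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)),
])
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```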

    12th SC@RUG 2015 proceedings: Student Colloquium 2014-2015


    MS-ADS: multistage spectrogram image-based anomaly detection system for IoT security.

    The innovative computing idea of the Internet-of-Things (IoT) architecture has gained tremendous popularity over the last decade, resulting in an exponential increase in connected devices and in the data processed by IoT networks. Since IoT devices collect and exchange a massive amount of sensitive information over the traditional internet, security has become a prime concern as network anomalies are generated more frequently. A network-based anomaly detection system can provide a much-needed, efficient security solution for the IoT network by detecting anomalies at the network entry points through constant traffic monitoring. Despite enormous efforts by researchers, these detection systems still suffer from low accuracy in detecting anomalies and generate high false-alarm and false-negative rates when classifying network traffic. To this end, this paper proposes an efficient Multistage Spectrogram image-based network Anomaly Detection System (MS-ADS) using a deep convolutional neural network that applies a short-time Fourier transform to convert flow features into spectrogram images. The results demonstrate that the proposed method achieves a high detection accuracy of 99.98% while reducing the false alarm rate to 0.006% in classifying network traffic. The proposed scheme also improves the prediction of anomaly instances by 0.75% to 4.82% compared with benchmark methodologies, demonstrating its efficiency for the IoT network. To minimize the computational and training cost of the model re-training phase, the proposed solution demonstrates that only 40,500 network flows from the dataset suffice to achieve a detection accuracy of 99.5%.
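    As a rough illustration of the spectrogram stage described above, the sketch below turns a 1-D sequence of flow features into a 2-D time-frequency image with a short-time Fourier transform. The window length, flow size, and normalisation are assumptions rather than the MS-ADS settings; the resulting image would then be fed to the convolutional classifier.

```python
# Flow features -> STFT -> normalised spectrogram image (one channel).
import numpy as np
from scipy.signal import stft

rng = np.random.default_rng(0)
flow_features = rng.normal(size=1024)   # hypothetical flattened flow-feature sequence

# STFT yields a complex time-frequency matrix; the magnitude in dB becomes the image.
freqs, times, Z = stft(flow_features, nperseg=64, noverlap=32)
spectrogram_db = 20 * np.log10(np.abs(Z) + 1e-9)

# Normalise to [0, 1] so the spectrogram can be treated as a one-channel image tensor.
img = (spectrogram_db - spectrogram_db.min()) / (np.ptp(spectrogram_db) + 1e-9)
print(img.shape)   # frequency bins x time frames, e.g. (33, 33)
```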

    Efficient Image Gallery Representations at Scale Through Multi-Task Learning

    Image galleries provide a rich source of diverse information about a product which can be leveraged across many recommendation and retrieval applications. We study the problem of building a universal image gallery encoder through a multi-task learning (MTL) approach and demonstrate that it is indeed a practical way to achieve generalizability of learned representations to new downstream tasks. Additionally, we analyze the relative predictive performance of MTL-trained solutions against optimal and substantially more expensive solutions, and find signals that MTL can be a useful mechanism to address sparsity in low-resource binary tasks. Comment: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval.
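    A minimal sketch of the multi-task setup described above, assuming a shared gallery encoder with per-task heads trained jointly; the pooling scheme, layer sizes, and the two example tasks are hypothetical and not the paper's architecture.

```python
# Shared gallery encoder with jointly trained task-specific heads (PyTorch).
import torch
import torch.nn as nn

class MultiTaskGalleryEncoder(nn.Module):
    def __init__(self, img_feat_dim=512, embed_dim=128):
        super().__init__()
        # Shared encoder: maps a pooled gallery representation to one embedding.
        self.encoder = nn.Sequential(
            nn.Linear(img_feat_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )
        # Task-specific heads; each downstream task adds a head and its own
        # loss term (here a 20-way category and a binary attribute, both assumed).
        self.heads = nn.ModuleDict({
            "category": nn.Linear(embed_dim, 20),
            "attribute": nn.Linear(embed_dim, 1),
        })

    def forward(self, gallery):              # gallery: (num_images, img_feat_dim)
        pooled = gallery.mean(dim=0)         # simple mean-pooling over the gallery
        z = self.encoder(pooled)
        return {name: head(z) for name, head in self.heads.items()}

model = MultiTaskGalleryEncoder()
outputs = model(torch.randn(8, 512))         # a gallery of 8 per-image feature vectors
loss = (nn.functional.cross_entropy(outputs["category"].unsqueeze(0), torch.tensor([3]))
        + nn.functional.binary_cross_entropy_with_logits(outputs["attribute"], torch.ones(1)))
loss.backward()                              # one shared backward pass over both task losses
```

    Training the encoder against the summed task losses is what lets the shared embedding transfer to new downstream tasks, including the low-resource binary ones mentioned in the abstract.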