
    Temporal Relational Reasoning in Videos

    Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species. In this paper, we introduce an effective and interpretable network module, the Temporal Relation Network (TRN), designed to learn and reason about temporal dependencies between video frames at multiple time scales. We evaluate TRN-equipped networks on activity recognition tasks using three recent video datasets - Something-Something, Jester, and Charades - which fundamentally depend on temporal relational reasoning. Our results demonstrate that the proposed TRN gives convolutional neural networks a remarkable capacity to discover temporal relations in videos. Using only sparsely sampled video frames, TRN-equipped networks can accurately predict human-object interactions in the Something-Something dataset and identify various human gestures in the Jester dataset with very competitive performance. TRN-equipped networks also outperform two-stream networks and 3D convolution networks in recognizing daily activities in the Charades dataset. Further analyses show that the models learn intuitive and interpretable visual common-sense knowledge in videos.
    Comment: camera-ready version for ECCV'18
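    The module described above scores ordered tuples of frame features at several temporal scales and fuses the per-scale predictions. Below is a minimal PyTorch sketch of that multi-scale relation idea, assuming pre-extracted per-frame features; the layer sizes, number of sampled tuples, and class count (174, as in Something-Something) are illustrative, not the authors' exact architecture.

```python
import itertools
import torch
import torch.nn as nn

class TemporalRelationSketch(nn.Module):
    """Illustrative multi-scale temporal relation pooling.

    Takes per-frame features (B, T, D); for each scale k it scores a few
    ordered k-frame combinations with a small MLP and sums the logits.
    """

    def __init__(self, feat_dim=256, num_frames=8, num_classes=174,
                 scales=(2, 3, 4), subsamples=3, hidden=256):
        super().__init__()
        self.scales = scales
        self.combos = {}                       # a few ordered frame-index lists per scale
        self.mlps = nn.ModuleDict()
        for k in scales:
            combos = list(itertools.combinations(range(num_frames), k))
            step = max(1, len(combos) // subsamples)
            self.combos[k] = [list(c) for c in combos[::step][:subsamples]]
            self.mlps[str(k)] = nn.Sequential(
                nn.Linear(k * feat_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, num_classes))

    def forward(self, frame_feats):            # frame_feats: (B, T, D)
        logits = 0
        for k in self.scales:
            for idx in self.combos[k]:
                x = frame_feats[:, idx, :].flatten(1)     # (B, k*D)
                logits = logits + self.mlps[str(k)](x)
        return logits

feats = torch.randn(4, 8, 256)                 # dummy per-frame CNN features
print(TemporalRelationSketch()(feats).shape)   # torch.Size([4, 174])
```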

    SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos

    In this paper, we introduce SoccerNet, a benchmark for action spotting in soccer videos. The dataset is composed of 500 complete soccer games from six main European leagues, covering three seasons from 2014 to 2017 and a total duration of 764 hours. A total of 6,637 temporal annotations are automatically parsed from online match reports at a one-minute resolution for three main classes of events (Goal, Yellow/Red Card, and Substitution). As such, the dataset is easily scalable. These annotations are manually refined to a one-second resolution by anchoring them at a single timestamp following well-defined soccer rules. With an average of one event every 6.9 minutes, this dataset focuses on the problem of localizing very sparse events within long videos. We define the task of spotting as finding the anchors of soccer events in a video. Making use of recent developments in the realm of generic action recognition and detection in video, we provide strong baselines for detecting soccer events. We show that our best model for classifying temporal segments of length one minute reaches a mean Average Precision (mAP) of 67.8%. For the spotting task, our baseline reaches an Average-mAP of 49.7% for tolerances δ ranging from 5 to 60 seconds. Our dataset and models are available at https://silviogiancola.github.io/SoccerNet.
    Comment: CVPR Workshop on Computer Vision in Sports 2018
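    The Average-mAP quoted above averages a tolerance-dependent average precision over spotting tolerances δ from 5 to 60 seconds. The sketch below illustrates that evaluation logic with a greedy within-tolerance matching; it is a simplified stand-in, not the official SoccerNet evaluation code, and the matching details are assumptions.

```python
import numpy as np

def spotting_ap(pred_times, pred_scores, gt_times, tol):
    """AP for one class at one tolerance (seconds): a prediction counts as a
    true positive if it lands within +/- tol of an unmatched ground-truth
    anchor, matching greedily in descending score order."""
    order = np.argsort(-np.asarray(pred_scores))
    matched = np.zeros(len(gt_times), dtype=bool)
    tp, fp = np.zeros(len(order)), np.zeros(len(order))
    for rank, i in enumerate(order):
        dists = np.abs(np.asarray(gt_times) - pred_times[i])
        dists[matched] = np.inf                # each anchor matched at most once
        j = int(np.argmin(dists)) if len(gt_times) else -1
        if j >= 0 and dists[j] <= tol:
            tp[rank], matched[j] = 1.0, True
        else:
            fp[rank] = 1.0
    prec = np.cumsum(tp) / (np.cumsum(tp) + np.cumsum(fp))
    return float(np.sum(prec * tp) / max(len(gt_times), 1))

# Average-mAP: per-class AP averaged over tolerances of 5 to 60 seconds
aps = [spotting_ap([12.0, 55.0, 300.0], [0.9, 0.8, 0.4], [10.0, 65.0], tol)
       for tol in range(5, 65, 5)]
print(sum(aps) / len(aps))
```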

    Early Churn Prediction from Large Scale User-Product Interaction Time Series

    User churn, characterized by customers ending their relationship with a business, has profound economic consequences across various Business-to-Customer scenarios. For numerous system-to-user actions, such as promotional discounts and retention campaigns, predicting potential churners stands as a primary objective. In volatile sectors like fantasy sports, unpredictable factors such as international sports events can influence even regular spending habits. Consequently, while transaction history and user-product interaction are valuable in predicting churn, they demand deep domain knowledge and intricate feature engineering. Additionally, feature development for churn prediction systems can be resource-intensive, particularly in production settings serving 200m+ users, where inference pipelines largely focus on feature engineering. This paper conducts an exhaustive study on predicting user churn using historical data. We aim to create a model forecasting customer churn likelihood, facilitating businesses in comprehending attrition trends and formulating effective retention plans. Our approach treats churn prediction as multivariate time series classification, demonstrating that combining user activity data with deep neural networks yields remarkable results for churn prediction in complex business-to-customer contexts.
    Comment: 12 pages, 3 tables, 8 figures, Accepted in ICML
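    Framing churn prediction as multivariate time-series classification, as the abstract describes, amounts to feeding a per-user sequence of daily activity features to a sequence model that outputs a churn probability. The PyTorch sketch below shows that framing; the feature set, window length, and LSTM architecture are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class ChurnClassifierSketch(nn.Module):
    """Toy multivariate time-series churn classifier.

    Input: (batch, days, features) of per-user daily activity signals
    (e.g. deposits, contests entered, session counts).
    Output: churn probability per user.
    """

    def __init__(self, n_features=6, hidden=64):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, 32), nn.ReLU(),
                                  nn.Linear(32, 1))

    def forward(self, x):                       # x: (B, T, F)
        _, (h, _) = self.encoder(x)             # h: (1, B, hidden)
        return torch.sigmoid(self.head(h[-1]))  # (B, 1) churn probability

model = ChurnClassifierSketch()
activity = torch.randn(32, 30, 6)               # 32 users, 30 days, 6 signals
print(model(activity).shape)                    # torch.Size([32, 1])
```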

    Deep Learning in Cardiology

    The medical field is creating a large amount of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient at solving complicated medical tasks or at creating insights from big data. Deep learning has emerged as a more accurate and effective technology for a wide range of medical problems such as diagnosis, prediction, and intervention. Deep learning is a representation learning method that consists of layers that transform the data non-linearly, thus revealing hierarchical relationships and structures. In this review, we survey deep learning application papers that use structured data, signal, and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology that also apply to medicine in general, while proposing certain directions as the most viable for clinical use.
    Comment: 27 pages, 2 figures, 10 tables

    Applied image recognition: guidelines for using deep learning models in practice

    In recent years, novel deep learning techniques, greater data availability, and significant growth in computing power have enabled AI researchers to tackle problems that had remained unassailable for many years. Furthermore, the advent of comprehensive AI frameworks offers a unique opportunity for adopting these new tools in applied fields. Information systems research can play a vital role in bridging the gap to practice. To this end, we conceptualize guidelines for applied image recognition spanning task definition, neural network configuration, and training procedures. We showcase our guidelines by means of a biomedical research project for image recognition.
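    As a concrete illustration of the configuration and training steps such guidelines cover, the sketch below sets up a pretrained backbone for a new image recognition task using torchvision. The choice of ResNet-18, the frozen feature extractor, and the hyperparameters are assumptions for illustration, not the paper's recommended recipe.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained backbone (downloads weights once) and
# replace the classification head for the task at hand.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():                   # freeze the feature extractor
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # e.g. a binary biomedical task

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on dummy data standing in for a real loader
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```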

    Automated Detection of Dental Caries from Oral Images using Deep Convolutional Neural Networks

    The urgent demand for accurate and efficient diagnostic methods to combat oral diseases, particularly dental caries, has led to the exploration of advanced techniques. Dental caries, caused by bacterial activity that weakens tooth enamel, can result in severe cavities and infections if not promptly treated. Despite existing imaging techniques, consistent and early diagnoses remain challenging. Traditional approaches, such as visual and tactile examinations, are prone to variations in expertise, necessitating more objective diagnostic tools. This study leverages deep learning to propose an explainable methodology for automated dental caries detection in images. Utilizing pre-trained convolutional neural networks (CNNs) including VGG-16, VGG-19, DenseNet-121, and Inception V3, we investigate different models and preprocessing techniques, such as histogram equalization and Sobel edge detection, to enhance the detection process. Our comprehensive experiments on a dataset of 884 oral images demonstrate the efficacy of the proposed approach in achieving accurate caries detection. Notably, the VGG-16 model achieves the best accuracy of 98.3% using the stochastic gradient descent (SGD) optimizer with Nesterov’s momentum. This research contributes to the field by introducing an interpretable deep learning-based solution for automated dental caries detection, enhancing diagnostic accuracy, and offering potential insights for dental health assessment.
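    The pipeline sketched below mirrors the ingredients named in the abstract: histogram-equalization preprocessing, a pretrained VGG-16 backbone with a binary caries/no-caries head, and SGD with Nesterov momentum. It is a hedged illustration only; the input size, learning rate, and head design are assumptions rather than the study's exact setup.

```python
import cv2
import numpy as np
import torch
import torch.nn as nn
from torchvision import models

def preprocess(path):
    """Histogram equalization plus resizing, one of the preprocessing
    variants mentioned in the abstract (Sobel edges would be analogous)."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    eq = cv2.equalizeHist(gray)
    eq = cv2.resize(eq, (224, 224)).astype(np.float32) / 255.0
    return torch.from_numpy(np.stack([eq] * 3))     # (3, 224, 224) tensor

# Pretrained VGG-16 with its final layer swapped for caries / no-caries
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
model.classifier[6] = nn.Linear(4096, 2)

# SGD with Nesterov momentum, as reported for the best-performing run;
# the learning rate and momentum value here are assumptions.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9, nesterov=True)
```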