Temporal Relational Reasoning in Videos
Temporal relational reasoning, the ability to link meaningful transformations
of objects or entities over time, is a fundamental property of intelligent
species. In this paper, we introduce an effective and interpretable network
module, the Temporal Relation Network (TRN), designed to learn and reason about
temporal dependencies between video frames at multiple time scales. We evaluate
TRN-equipped networks on activity recognition tasks using three recent video
datasets - Something-Something, Jester, and Charades - which fundamentally
depend on temporal relational reasoning. Our results demonstrate that the
proposed TRN gives convolutional neural networks a remarkable capacity to
discover temporal relations in videos. Through only sparsely sampled video
frames, TRN-equipped networks can accurately predict human-object interactions
in the Something-Something dataset and identify various human gestures on the
Jester dataset with very competitive performance. TRN-equipped networks also
outperform two-stream networks and 3D convolution networks in recognizing daily
activities in the Charades dataset. Further analyses show that the models learn
intuitive and interpretable visual common sense knowledge in videos.
Comment: camera-ready version for ECCV'18
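The abstract describes scoring temporal relations over sparsely sampled frame tuples at multiple time scales. A minimal sketch of the sampling side of that idea, with hypothetical names and parameters (the real TRN scores each tuple with a small MLP, which is omitted here):

```python
from itertools import combinations
import random

def multi_scale_relations(num_frames, scales=(2, 3, 4), tuples_per_scale=3, seed=0):
    """Sketch of TRN-style sparse sampling: for each scale k, draw a few
    temporally ordered k-frame index tuples. In the full network, the
    features of each tuple are fed to a per-scale relation module and the
    scores are summed across scales (not implemented in this sketch)."""
    rng = random.Random(seed)
    sampled = {}
    for k in scales:
        # combinations() over sorted indices yields temporally ordered tuples
        all_tuples = list(combinations(range(num_frames), k))
        sampled[k] = rng.sample(all_tuples, min(tuples_per_scale, len(all_tuples)))
    return sampled

# e.g. tuples drawn from an 8-frame clip at scales 2, 3, and 4
tuples_by_scale = multi_scale_relations(8)
```

Sampling a fixed small number of tuples per scale, rather than enumerating all of them, is what keeps the frame sampling sparse.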
SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos
In this paper, we introduce SoccerNet, a benchmark for action spotting in
soccer videos. The dataset is composed of 500 complete soccer games from six
main European leagues, covering three seasons from 2014 to 2017 and a total
duration of 764 hours. A total of 6,637 temporal annotations are automatically
parsed from online match reports at a one minute resolution for three main
classes of events (Goal, Yellow/Red Card, and Substitution). As such, the
dataset is easily scalable. These annotations are manually refined to a one
second resolution by anchoring them at a single timestamp following
well-defined soccer rules. With an average of one event every 6.9 minutes, this
dataset focuses on the problem of localizing very sparse events within long
videos. We define the task of spotting as finding the anchors of soccer events
in a video. Making use of recent developments in the realm of generic action
recognition and detection in video, we provide strong baselines for detecting
soccer events. We show that our best model for classifying temporal segments of
length one minute reaches a mean Average Precision (mAP) of 67.8%. For the
spotting task, our baseline reaches an Average-mAP of 49.7% for tolerances
ranging from 5 to 60 seconds. Our dataset and models are available at
https://silviogiancola.github.io/SoccerNet.
Comment: CVPR Workshop on Computer Vision in Sports 2018
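The spotting metric counts a prediction as correct if it falls within a tolerance of a ground-truth anchor. A simplified sketch of that tolerance-based matching (greedy one-to-one matching; the paper's Average-mAP additionally averages over tolerances from 5 to 60 seconds and over classes, which is not reproduced here):

```python
def spot_matches(predictions, ground_truth, tolerance):
    """Count true-positive spots: each predicted timestamp (seconds) may
    match at most one unmatched ground-truth anchor within +/- tolerance.
    A hypothetical simplification of the paper's spotting evaluation."""
    unmatched = sorted(ground_truth)
    tp = 0
    for p in sorted(predictions):
        hit = next((g for g in unmatched if abs(p - g) <= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)  # one-to-one: consume the matched anchor
            tp += 1
    return tp

# two of three predicted spots land within 5 s of an anchor
tp = spot_matches([12.0, 95.0, 300.0], [10.0, 100.0], tolerance=5)
```

With one event every 6.9 minutes on average, almost all of a match is background, which is why the task is framed as finding anchors rather than classifying dense segments.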
Early Churn Prediction from Large Scale User-Product Interaction Time Series
User churn, characterized by customers ending their relationship with a
business, has profound economic consequences across various
Business-to-Customer scenarios. For numerous system-to-user actions, such as
promotional discounts and retention campaigns, predicting potential churners
stands as a primary objective. In volatile sectors like fantasy sports,
unpredictable factors such as international sports events can influence even
regular spending habits. Consequently, while transaction history and
user-product interaction are valuable in predicting churn, they demand deep
domain knowledge and intricate feature engineering. Additionally, feature
development for churn prediction systems can be resource-intensive,
particularly in production settings serving 200m+ users, where inference
pipelines largely focus on feature engineering. This paper conducts an
exhaustive study on predicting user churn using historical data. We aim to
create a model forecasting customer churn likelihood, facilitating businesses
in comprehending attrition trends and formulating effective retention plans.
Our approach treats churn prediction as multivariate time series
classification, demonstrating that combining user activity and deep neural
networks yields remarkable results for churn prediction in complex
business-to-customer contexts.
Comment: 12 pages, 3 tables, 8 figures, accepted at ICML
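The abstract frames churn prediction as classification over per-user activity time series. An illustrative labeling rule for constructing training targets from such a series (a hypothetical proxy definition, not the paper's):

```python
def churn_label(activity, inactive_days=14):
    """Label a user as churned if the trailing `inactive_days` entries of
    their daily-activity series are all zero. Both the rule and the window
    length are illustrative assumptions; real churn definitions are
    business-specific."""
    tail = activity[-inactive_days:]
    return len(tail) == inactive_days and all(v == 0 for v in tail)
```

Given such labels, each user's multivariate series (transactions, logins, product interactions per day) becomes one training example for a sequence classifier, avoiding the hand-crafted feature engineering the abstract cautions against.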
Deep Learning in Cardiology
The medical field is creating large amount of data that physicians are unable
to decipher and use efficiently. Moreover, rule-based expert systems are
inefficient in solving complicated medical tasks or for creating insights using
big data. Deep learning has emerged as a more accurate and effective technology
in a wide range of medical problems such as diagnosis, prediction and
intervention. Deep learning is a representation learning method that consists
of layers that transform the data non-linearly, thus, revealing hierarchical
relationships and structures. In this review we survey deep learning
application papers that use structured data, signal and imaging modalities from
cardiology. We discuss the advantages and limitations of applying deep learning
in cardiology that also apply in medicine in general, while proposing certain
directions as the most viable for clinical use.
Comment: 27 pages, 2 figures, 10 tables
Applied image recognition: guidelines for using deep learning models in practice
In recent years, novel deep learning techniques, greater data availability, and significant growth in computing power have enabled AI researchers to tackle problems that had remained unassailable for many years. Furthermore, the advent of comprehensive AI frameworks offers a unique opportunity for adopting these new tools in applied fields. Information systems research can play a vital role in bridging the gap to practice. To this end, we conceptualize guidelines for applied image recognition spanning task definition, neural network configuration, and training procedures. We showcase our guidelines by means of a biomedical research project for image recognition.
Automated Detection of Dental Caries from Oral Images using Deep Convolutional Neural Networks
The urgent demand for accurate and efficient diagnostic methods to combat oral diseases, particularly dental caries, has led to the exploration of advanced techniques. Dental caries, caused by bacterial activities that weaken tooth enamel, can result in severe cavities and infections if not promptly treated. Despite existing imaging techniques, consistent and early diagnoses remain challenging. Traditional approaches, such as visual and tactile examinations, are prone to variations in expertise, necessitating more objective diagnostic tools. This study leverages deep learning to propose an explainable methodology for automated dental caries detection in images. Utilizing pre-trained convolutional neural networks (CNNs) including VGG-16, VGG-19, DenseNet-121, and Inception V3, we investigate different models and preprocessing techniques, such as histogram equalization and Sobel edge detection, to enhance the detection process. Our comprehensive experiments on a dataset of 884 oral images demonstrate the efficacy of the proposed approach in achieving accurate caries detection. Notably, the VGG-16 model achieves the best accuracy of 98.3% using the stochastic gradient descent (SGD) optimizer with Nesterov's momentum. This research contributes to the field by introducing an interpretable deep learning-based solution for automated dental caries detection, enhancing diagnostic accuracy, and offering potential insights for dental health assessment.
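The abstract names histogram equalization as one of the preprocessing steps. A minimal pure-Python sketch of global histogram equalization on an 8-bit grayscale image (real pipelines would typically use OpenCV's `cv2.equalizeHist` or scikit-image's `exposure.equalize_hist` instead):

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization: remap pixel intensities so the
    cumulative distribution becomes approximately uniform, stretching
    contrast. `image` is a list of rows of ints in [0, levels).
    A sketch of the standard CDF-based formula, not the paper's code."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # cumulative distribution function of pixel intensities
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)

    def remap(p):
        if n == cdf_min:  # flat image: nothing to stretch
            return 0
        return round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))

    return [[remap(p) for p in row] for row in image]
```

Stretching a low-contrast intensity range across the full 0-255 scale is what makes lesion boundaries easier for a CNN to pick up, which is presumably why the study evaluates it alongside Sobel edge detection.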