1,873 research outputs found

    Towards Structured Analysis of Broadcast Badminton Videos

    Full text link
    Sports video data is recorded for nearly every major tournament but remains archived and inaccessible to large-scale data mining and analytics. It can only be viewed sequentially or manually tagged with higher-level labels, which is time-consuming and error-prone. In this work, we propose an end-to-end framework for automatic attribute tagging and analysis of sports videos. We use commonly available broadcast videos of matches and, unlike previous approaches, do not rely on special camera setups or additional sensors. Our sport of interest is badminton. We propose a method to analyze a large corpus of badminton broadcast videos by segmenting the points played, tracking and recognizing the players in each point, and annotating their respective badminton strokes. We evaluate the performance on 10 Olympic matches with 20 players and achieve 95.44% point segmentation accuracy, 97.38% player detection score (mAP@0.5), 97.98% player identification accuracy, and a stroke segmentation edit score of 80.48%. We further show that the automatically annotated videos alone enable gameplay analysis and inference by computing understandable metrics such as a player's reaction time, speed, and footwork around the court. Comment: 9 pages

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    Full text link
    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools, and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial, and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing

    Object detection for KRSBI robot soccer using PeleeNet on omnidirectional camera

    Get PDF
    Kontes Robot Sepak Bola Indonesia (KRSBI) is an annual event in which contestants compete their robot designs and engineering on the field of robot soccer. Each contestant tries to win the match by scoring a goal against the opponent. To score a goal, the robot needs to find the ball, locate the goal, then kick the ball toward the goal. We employed an omnidirectional vision camera as the visual sensor for a robot to perceive object information. We calibrated streaming images from the camera to remove the mirror distortion. Furthermore, we deployed PeleeNet as our deep-learning model for object detection and fine-tuned it on a dataset generated from our image collection. Our experiment results showed that PeleeNet has potential as an object detection architecture for deep-learning mobile platforms in KRSBI, combining memory efficiency, speed, and accuracy.
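The mirror-distortion removal the abstract describes is typically done by unwrapping the circular omnidirectional image into a panorama with a polar-to-Cartesian mapping. A minimal sketch of that coordinate mapping; the mirror center, radii, and output size below are illustrative assumptions, not the team's calibration values:

```python
import math

# Hedged sketch: mapping a panorama pixel (u, v) back to its source
# coordinates in a circular omnidirectional (catadioptric) image.
# cx, cy: mirror center; r_in, r_out: inner/outer usable radii;
# out_w, out_h: panorama dimensions. All values here are assumptions.

def unwrap_coords(u, v, cx, cy, r_in, r_out, out_w, out_h):
    theta = 2.0 * math.pi * u / out_w        # panorama column -> viewing angle
    r = r_in + (r_out - r_in) * v / out_h    # panorama row -> radius on mirror
    return cx + r * math.cos(theta), cy + r * math.sin(theta)

x, y = unwrap_coords(0, 0, cx=320, cy=240, r_in=40, r_out=200,
                     out_w=720, out_h=160)
print(round(x), round(y))  # theta = 0, r = r_in -> point on inner ring
```

A full unwarp samples the source image at these coordinates (with interpolation) for every panorama pixel before the detector runs.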

    Online Visual Robot Tracking and Identification using Deep LSTM Networks

    Full text link
    Collaborative robots working on a common task are necessary for many applications. One of the challenges in achieving collaboration in a team of robots is mutual tracking and identification. We present a novel pipeline for online vision-based detection, tracking, and identification of robots with a known and identical appearance. Our method runs in real-time on the limited hardware of the observer robot. Unlike previous works addressing robot tracking and identification, we use a data-driven approach based on recurrent neural networks to learn relations between sequential inputs and outputs. We formulate the data association problem as multiple classification problems. A deep LSTM network was trained on a simulated dataset and fine-tuned on a small set of real data. Experiments on two challenging datasets, one synthetic and one real, which include long-term occlusions, show promising results. Comment: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, 2017. IROS RoboCup Best Paper Award
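The "data association as classification" idea can be illustrated with a toy decoder: a classifier emits, for each detection, a probability distribution over known robot identities, and a greedy one-to-one assignment picks the best pairing. The probabilities below are made-up placeholders, not LSTM outputs, and the greedy rule is a simplification of whatever the paper's network learns:

```python
# Hedged sketch of data association from per-detection identity
# probabilities. prob_rows[d][k] = P(detection d is robot k); values
# here are synthetic, not outputs of the paper's LSTM.

def associate(prob_rows):
    """Greedy one-to-one assignment: robot id -> detection index."""
    pairs = sorted(
        ((p, d, k) for d, row in enumerate(prob_rows)
                   for k, p in enumerate(row)),
        reverse=True,
    )
    used_d, used_k, assignment = set(), set(), {}
    for p, d, k in pairs:
        if d not in used_d and k not in used_k:
            assignment[k] = d
            used_d.add(d)
            used_k.add(k)
    return assignment

probs = [[0.9, 0.1],   # detection 0 strongly looks like robot 0
         [0.3, 0.7]]   # detection 1 leans toward robot 1
print(associate(probs))  # {0: 0, 1: 1}
```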

    Size and Shape Determination of Riprap and Large-sized Aggregates Using Field Imaging

    Get PDF
    Riprap rock and large-sized aggregates are extensively used in transportation, geotechnical, and hydraulic engineering applications. Traditional methods for assessing riprap categories based on particle weight may involve subjective visual inspection and time-consuming manual measurements. Aggregate imaging and segmentation techniques can efficiently characterize riprap particles for their size and morphological/shape properties to estimate particle weights. Particle size and morphological/shape characterization ensure the reliable and sustainable use of all aggregate skeleton materials at quarry production lines and construction sites. Aggregate imaging systems developed to date for size and shape characterization, however, have primarily focused on measurement of separated or non-overlapping aggregate particles. This research study presents an innovative approach for automated segmentation and morphological analyses of stockpile aggregate images based on deep-learning techniques. As a project outcome, a portable, deployable, and affordable field-imaging system is envisioned to estimate volumes of individual riprap rocks for field evaluation. A state-of-the-art object detection and segmentation framework is used to train an image-segmentation kernel from manually labeled 2D riprap images in order to facilitate automatic and user-independent segmentation of stockpile aggregate images. The segmentation results show good agreement with ground-truth validation, which entailed comparing the manual labeling to the automatically segmented images. A significant improvement to the efficiency of size and morphological analyses conducted on densely stacked and overlapping particle images is achieved. The algorithms are integrated into a software application with a user-friendly Graphical User Interface (GUI) for ease of operation. 
Based on the findings of this study, this stockpile aggregate image analysis program promises to become an efficient and innovative application for field-scale and in-place evaluations of aggregate materials. The imaging-based system is envisioned to provide convenient, reliable, and sustainable solutions for on-site quality assurance/quality control (QA/QC) tasks related to riprap rock and large-sized aggregate material characterization and classification. IDOT-R27-182
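The step from a segmented 2D mask to an estimated particle volume can be sketched with a simple equivalent-sphere assumption: the mask's pixel area gives an equivalent circle diameter, from which a sphere volume follows. This is a crude stand-in for the report's actual volume model, and the pixel-to-metre scale is an illustrative parameter, not a calibrated value:

```python
import math

# Hedged sketch: rock volume from segmented mask area via an
# equivalent-sphere assumption. Not the study's actual method;
# metres_per_px is an assumed, uncalibrated scale.

def volume_from_mask_area(area_px, metres_per_px):
    area_m2 = area_px * metres_per_px ** 2        # projected area, m^2
    d_eq = 2.0 * math.sqrt(area_m2 / math.pi)     # equivalent circle diameter, m
    return math.pi / 6.0 * d_eq ** 3              # equivalent sphere volume, m^3

v = volume_from_mask_area(area_px=5000, metres_per_px=0.01)
print(f"{v:.4f} m^3")
```

Summing such per-particle estimates over a segmented stockpile image would give the kind of field-scale volume evaluation the report targets.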

    Over speed detection using Artificial Intelligence

    Get PDF
    Over-speeding is one of the most common traffic violations. Around 41 million people are issued speeding tickets each year in the USA, i.e., roughly one every second. Existing approaches to detecting over-speeding are not scalable and require manual effort. In this project, using computer vision and artificial intelligence, I detect over-speeding and report the violation to a law enforcement officer. It was observed that predictions made with YOLOv3 gave the best results.
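Once a detector such as YOLOv3 yields a vehicle track, speed follows from pixel displacement between frames, an assumed ground-plane scale, and the frame rate. A minimal sketch; the scale, frame rate, and speed limit below are illustrative assumptions, not the project's calibration:

```python
# Hedged sketch: converting a tracked vehicle's pixel displacement to km/h
# and flagging over-speed. metres_per_px, fps, and limit_kmh are assumed
# illustrative constants, not values from the project.

def speed_kmh(px_displacement, metres_per_px, fps, frames_elapsed):
    metres = px_displacement * metres_per_px
    seconds = frames_elapsed / fps
    return metres / seconds * 3.6   # m/s -> km/h

def is_over_speed(px_displacement, metres_per_px, fps, frames_elapsed,
                  limit_kmh=60.0):
    return speed_kmh(px_displacement, metres_per_px, fps, frames_elapsed) > limit_kmh

s = speed_kmh(px_displacement=100, metres_per_px=0.05, fps=30, frames_elapsed=6)
print(f"{s:.1f} km/h")  # 5 m covered in 0.2 s -> 90 km/h
```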

    Application-aware optimization of Artificial Intelligence for deployment on resource constrained devices

    Get PDF
    Artificial intelligence (AI) is changing people's everyday life. AI techniques such as deep neural networks (DNNs) rely on heavy computational models, which are in principle designed to be executed on powerful hardware platforms, such as desktop or server environments. However, the increasing need to apply such solutions in everyday life has encouraged research into methods that allow their deployment on embedded, portable, and stand-alone devices, such as mobile phones, which exhibit relatively low memory and computational resources. Such methods target both the development of lightweight AI algorithms and their acceleration through dedicated hardware. This thesis focuses on the development of lightweight AI solutions, with attention to deep neural networks, to facilitate their deployment on resource-constrained devices. Focusing on the computer vision field, we show how combining the self-learning ability of deep neural networks with application-specific knowledge, in the form of feature engineering, makes it possible to dramatically reduce the total memory and computational burden, thus allowing deployment on edge devices. The proposed approach aims to be complementary to existing application-independent network compression solutions. In this work, three main DNN optimization goals have been considered: increasing speed and accuracy, allowing training at the edge, and allowing execution on a microcontroller. For each of these, we deployed the resulting algorithm to the target embedded device and measured its performance.
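A back-of-the-envelope view of why model footprint matters for the microcontroller goal: parameter memory scales linearly with bits per weight, so moving from 32-bit floats to 8-bit integers cuts the weight budget fourfold. The parameter count below is illustrative, not taken from the thesis:

```python
# Hedged sketch: parameter-memory footprint at different precisions.
# n_params is an illustrative figure for a small mobile CNN, not a
# model from the thesis.

def footprint_bytes(n_params, bits_per_weight):
    return n_params * bits_per_weight // 8

n = 1_200_000
fp32 = footprint_bytes(n, 32)   # float32 weights
int8 = footprint_bytes(n, 8)    # 8-bit quantized weights
print(fp32 // 1024, "KiB ->", int8 // 1024, "KiB")
```

On a microcontroller with a few hundred KiB of flash, this kind of estimate decides up front whether a model can fit at all.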

    Single-Shot Multi-Person 3D Pose Estimation From Monocular RGB

    Full text link
    We propose a new single-shot method for multi-person 3D pose estimation in general scenes from a monocular RGB camera. Our approach uses novel occlusion-robust pose-maps (ORPM), which enable full-body pose inference even under strong partial occlusions by other people and objects in the scene. ORPM outputs a fixed number of maps which encode the 3D joint locations of all people in the scene. Body part associations allow us to infer 3D pose for an arbitrary number of people without explicit bounding box prediction. To train our approach, we introduce MuCo-3DHP, the first large-scale training data set showing real images of sophisticated multi-person interactions and occlusions. We synthesize a large corpus of multi-person images by compositing images of individual people (with ground truth from multi-view performance capture). We evaluate our method on our new, challenging 3D-annotated multi-person test set, MuPoTs-3D, where we achieve state-of-the-art performance. To further stimulate research in multi-person 3D pose estimation, we will make our new datasets and associated code publicly available for research purposes. Comment: International Conference on 3D Vision (3DV), 2018
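A common building block behind map-based pose readout is locating a joint by taking the argmax over a 2D confidence grid; the location maps then supply the 3D coordinates at that position. The tiny grid below is synthetic and the argmax readout is a generic simplification, not the ORPM decoding procedure itself:

```python
# Hedged sketch: reading a 2D joint position from a confidence map by
# argmax. The heatmap is a synthetic toy grid, not an ORPM output.

def argmax_2d(heatmap):
    """Return (row, col) of the strongest response in a list-of-lists map."""
    best = (float("-inf"), -1, -1)
    for r, row in enumerate(heatmap):
        for c, v in enumerate(row):
            if v > best[0]:
                best = (v, r, c)
    return best[1], best[2]

hm = [[0.1, 0.2, 0.1],
      [0.2, 0.9, 0.3],
      [0.1, 0.2, 0.1]]
print(argmax_2d(hm))  # (1, 1)
```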