
    A Survey on Content-Aware Video Analysis for Sports

    Sports data analysis is becoming increasingly large-scale, diversified, and shared, but difficulty persists in rapidly accessing the most crucial information. Previous surveys have focused on the methodologies of sports video analysis from the spatiotemporal viewpoint instead of a content-based viewpoint, and few of these studies have considered semantics. This study develops a deeper interpretation of content-aware sports video analysis by examining the insight offered by research into the structure of content under different scenarios. On the basis of this insight, we provide an overview of the themes particularly relevant to the research on content-aware systems for broadcast sports. Specifically, we focus on the video content analysis techniques applied in sportscasts over the past decade from the perspectives of fundamentals and general review, a content hierarchical model, and trends and challenges. Content-aware analysis methods are discussed with respect to object-, event-, and context-oriented groups. In each group, the gap between sensation and content excitement must be bridged using proper strategies. In this regard, a content-aware approach is required to determine user demands. Finally, the paper summarizes the future trends and challenges for sports video analysis. We believe that our findings can advance the field of research on content-aware video analysis for broadcast sports. Comment: Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.

    Holistic Parametric Reconstruction of Building Models from Point Clouds

    Building models are conventionally reconstructed by planar segmentation of the building roof points, followed by a topology graph that groups the planes together. Roof edges and vertices are then mathematically represented by intersecting segmented planes. Technically, such a solution is based on sequential local fitting, i.e., the entire data of one building do not simultaneously participate in determining the building model. As a consequence, the solution lacks topological integrity and geometric rigor. Fundamentally different from this traditional approach, we propose a holistic parametric reconstruction method that takes into consideration the entire point cloud of one building simultaneously. In our work, building models are reconstructed from predefined parametric (roof) primitives. We first use a well-designed deep neural network to segment and identify primitives in the given building point clouds. A holistic optimization strategy is then introduced to simultaneously determine the parameters of a segmented primitive. In the last step, the optimal parameters are used to generate a watertight building model in CityGML format. The airborne LiDAR dataset RoofN3D with predefined roof types is used for our test. It is shown that PointNet++ applied to the entire dataset can achieve an accuracy of 83% for primitive classification. For a subset of 910 buildings in RoofN3D, the holistic approach is then used to determine the parameters of primitives and reconstruct the buildings. The achieved overall quality of reconstruction is 0.08 meters for point-surface-distance or 0.7 times RMSE of the input LiDAR points. The study demonstrates the efficiency and capability of the proposed approach and its potential to handle large-scale urban point clouds.

    Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning

    This paper presents KeypointNet, an end-to-end geometric reasoning framework to learn an optimal set of category-specific 3D keypoints, along with their detectors. Given a single image, KeypointNet extracts 3D keypoints that are optimized for a downstream task. We demonstrate this framework on 3D pose estimation by proposing a differentiable objective that seeks the optimal set of keypoints for recovering the relative pose between two views of an object. Our model discovers geometrically and semantically consistent keypoints across viewing angles and instances of an object category. Importantly, we find that our end-to-end framework using no ground-truth keypoint annotations outperforms a fully supervised baseline using the same neural network architecture on the task of pose estimation. The discovered 3D keypoints on the car, chair, and plane categories of ShapeNet are visualized at http://keypointnet.github.io/

    Multi-agents Architecture for Semantic Retrieving Video in Distributed Environment

    This paper presents an integrated multi-agent architecture for indexing and retrieving video information. The focus of our work is to elaborate an extensible approach that gathers, a priori, almost all of the mandatory tools that alleviate the major intertwined problems raised across the whole video lifecycle (classification, indexing and retrieval). In fact, effective and optimal retrieval of video information needs a collaborative approach based on multimodal aspects. Clearly, it must take into account the distributed nature of the data sources, the adaptation of the contents, semantic annotation, personalized requests and active feedback, which constitute the backbone of a vigorous system that improves its performance in a smart way. Comment: 11 pages, 11 figures, The Proceeding of International Conference on Soft Computing and Software Engineering 201

    Machine Vision in the Context of Robotics: A Systematic Literature Review

    Machine vision is critical to robotics because a wide range of applications, such as autonomous mobile robots and smart production systems, rely on input from visual sensors. To create the smart homes and systems of tomorrow, an overview of current challenges in the research field, created in a systematic and reproducible manner, would help identify further possible directions. In this work, a systematic literature review was conducted covering research from the last 10 years. We screened 172 papers from four databases and selected 52 relevant papers. While robustness and computation time have improved greatly, occlusion and lighting variance are still the biggest problems faced. From the number of recent publications, we conclude that the observed field is of relevance and interest to the research community. Further challenges arise in many areas of the field. Comment: 10 pages, 5 figures, systematic literature study

    Ground-based Observations of the Solar Sources of Space Weather (Invited Review)

    Monitoring of the Sun and its activity is a task of growing importance in the frame of space weather research and awareness. Major space weather disturbances at Earth have their origin in energetic outbursts from the Sun: solar flares, coronal mass ejections and associated solar energetic particles. In this review we discuss the importance and complementarity of ground-based and space-based observations for space weather studies. The main focus is drawn on ground-based observations in the visible range of the spectrum, in particular in the diagnostically manifold Hα spectral line, which enables us to detect and study solar flares, filaments, filament eruptions, and Moreton waves. Existing Hα networks such as the GONG and the Global High-Resolution Hα Network are discussed. As an example of solar observations from space weather research to operations, we present the system of real-time detection of Hα flares and filaments established at Kanzelhöhe Observatory (KSO; Austria) in the frame of the ESA Space Situational Awareness programme. During the evaluation period 7/2013 - 11/2015, KSO provided 3020 hours of real-time Hα observations at the SWE portal. In total, 824 Hα flares were detected and classified by the real-time detection system, including 174 events of Hα importance class 1 and larger. For the total sample of events, 95% of the automatically determined flare peak times lie within ±5 min of the values given in the official optical flares reports (by NOAA and KSO), and 76% of the start times. The heliographic positions determined are better than ±5°. The probability of detection of flares of importance 1 or larger is 95%, with a false alarm rate of 16%. These numbers confirm the high potential of automatic flare detection and alerting from ground-based observatories. Comment: Accepted for "Ground-based Solar Observations in the Space Instrumentation Era", Proceedings of the Coimbra Solar Physics Meeting 2015, ASP Conference Series, Eds. I. Dorotovic, C. Fischer, and M. Temmer; 16p

    Automatic Image Filtering on Social Networks Using Deep Learning and Perceptual Hashing During Crises

    The extensive use of social media platforms, especially during disasters, creates unique opportunities for humanitarian organizations to gain situational awareness and launch relief operations accordingly. In addition to the textual content, people post overwhelming amounts of imagery data on social networks within minutes of a disaster hitting. Studies point to the importance of this online imagery content for emergency response. Despite recent advances in the computer vision field, automatic processing of crisis-related social media imagery data remains a challenging task, because the majority of it consists of redundant and irrelevant content. In this paper, we present an image processing pipeline that comprises de-duplication and relevancy filtering mechanisms to collect and filter social media image content in real-time during a crisis event. Results obtained from extensive experiments on real-world crisis datasets demonstrate the significance of the proposed pipeline for optimal utilization of both human and machine computing resources. Comment: Accepted for publication in the 14th International Conference on Information Systems For Crisis Response and Management (ISCRAM), 201

    A state of the art of urban reconstruction: street, street network, vegetation, urban feature

    The world population is rising, especially the share of people living in cities. With increased population and complex roles regarding their inhabitants and their surroundings, cities concentrate difficulties for design, planning and analysis. These tasks require a way to reconstruct/model a city. Traditionally, much attention has been given to building reconstruction, yet an essential part of the city has been neglected: streets. Street reconstruction has seldom been researched. Streets are also complex compositions of urban features, and have a unique role for transportation (as they comprise roads). We aim at completing the recent state of the art for building reconstruction (Musialski2012) by considering all other aspects of urban reconstruction. We introduce the need for city models. Because reconstruction always necessitates data, we first analyse which data are available. We then present a state of the art of street reconstruction, street network reconstruction, urban features reconstruction/modelling, vegetation, and urban objects reconstruction/modelling. Although reconstruction strategies vary widely, we can order them by the role the model plays, from data-driven approaches, to model-based approaches, to inverse procedural modelling and model catalogue matching. The main challenges seem to come from the complex nature of the urban environment and from the limitations of the available data. Urban features have strong relationships, both with each other and with their surroundings, as well as hierarchical relations. Procedural modelling has the power to express these relations, and could be applied to the reconstruction of urban features via the Inverse Procedural Modelling paradigm. Comment: Extracted from PhD (chap1

    Image Provenance Analysis at Scale

    Prior art has shown it is possible to estimate, through image processing and computer vision techniques, the types and parameters of transformations that have been applied to the content of individual images to obtain new images. Given a large corpus of images and a query image, an interesting further step is to retrieve the set of original images whose content is present in the query image, as well as the detailed sequences of transformations that yield the query image given the original images. This is a problem that recently has received the name of image provenance analysis. In these times of public media manipulation (e.g., fake news and meme sharing), obtaining the history of image transformations is relevant for fact checking and authorship verification, among many other applications. This article presents an end-to-end processing pipeline for image provenance analysis, which works at real-world scale. It employs a cutting-edge image filtering solution that is custom-tailored for the problem at hand, as well as novel techniques for obtaining the provenance graph that expresses how the images, as nodes, are ancestrally connected. A comprehensive set of experiments for each stage of the pipeline is provided, comparing the proposed solution with state-of-the-art results, employing previously published datasets. In addition, this work introduces a new dataset of real-world provenance cases from the social media site Reddit, along with baseline results. Comment: 13 pages, 6 figures