2,106 research outputs found

    An Introduction to Conditional Random Fields

    Get PDF

    Interactive energy minimizing segmentation frameworks

    Get PDF
    [no abstract

    Continuous Modeling of 3D Building Rooftops From Airborne LIDAR and Imagery

    Get PDF
    In recent years, a number of mega-cities have provided 3D photorealistic virtual models to support the decisions making process for maintaining the cities' infrastructure and environment more effectively. 3D virtual city models are static snap-shots of the environment and represent the status quo at the time of their data acquisition. However, cities are dynamic system that continuously change over time. Accordingly, their virtual representation need to be regularly updated in a timely manner to allow for accurate analysis and simulated results that decisions are based upon. The concept of "continuous city modeling" is to progressively reconstruct city models by accommodating their changes recognized in spatio-temporal domain, while preserving unchanged structures. However, developing a universal intelligent machine enabling continuous modeling still remains a challenging task. Therefore, this thesis proposes a novel research framework for continuously reconstructing 3D building rooftops using multi-sensor data. For achieving this goal, we first proposes a 3D building rooftop modeling method using airborne LiDAR data. The main focus is on the implementation of an implicit regularization method which impose a data-driven building regularity to noisy boundaries of roof planes for reconstructing 3D building rooftop models. The implicit regularization process is implemented in the framework of Minimum Description Length (MDL) combined with Hypothesize and Test (HAT). Secondly, we propose a context-based geometric hashing method to align newly acquired image data with existing building models. The novelty is the use of context features to achieve robust and accurate matching results. Thirdly, the existing building models are refined by newly proposed sequential fusion method. The main advantage of the proposed method is its ability to progressively refine modeling errors frequently observed in LiDAR-driven building models. The refinement process is conducted in the framework of MDL combined with HAT. Markov Chain Monte Carlo (MDMC) coupled with Simulated Annealing (SA) is employed to perform a global optimization. The results demonstrates that the proposed continuous rooftop modeling methods show a promising aspects to support various critical decisions by not only reconstructing 3D rooftop models accurately, but also by updating the models using multi-sensor data

    Rich probabilistic models for semantic labeling

    Get PDF
    Das Ziel dieser Monographie ist es die Methoden und Anwendungen des semantischen Labelings zu erforschen. Unsere Beiträge zu diesem sich rasch entwickelten Thema sind bestimmte Aspekte der Modellierung und der Inferenz in probabilistischen Modellen und ihre Anwendungen in den interdisziplinären Bereichen der Computer Vision sowie medizinischer Bildverarbeitung und Fernerkundung

    Audio-visual football video analysis, from structure detection to attention analysis

    Get PDF
    Sport video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academic fields. A sports video is characterised by repetitive temporal structures, relatively plain contents, and strong spatio-temporal variations, such as quick camera switches and swift local motions. It is necessary to develop specific techniques for content-based sports video analysis to utilise these characteristics. For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are key stories for sports videos; (2) what incurs viewer’s interest; and (3) how to identify game highlights. This thesis is developed around these questions. We approached these questions from two different perspectives and in turn three research contributions are presented, namely, replay detection, attack temporal structure decomposition, and attention-based highlight identification. Replay segments convey the most important contents in sports videos. It is an efficient approach to collect game highlights by detecting replay segments. However, replay is an artefact of editing, which improves with advances in video editing tools. The composition of replay is complex, which includes logo transitions, slow motions, viewpoint switches and normal speed video clips. Since logo transition clips are pervasive in game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement of replay detection. A two-pass system was developed, including a five-layer adaboost classifier and a logo template matching throughout an entire video. The five-layer adaboost utilises shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions, to filter out logo transition candidates. Subsequently, a logo template is constructed and employed to find all transition logo sequences. The precision and recall of this system in replay detection is 100% in a five-game evaluation collection. An attack structure is a team competition for a score. Hence, this structure is a conceptually fundamental unit of a football video as well as other sports videos. We review the literature of content-based temporal structures, such as play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely, play, focus, replay and break were identified by low level visual features. A four-state hidden Markov model was trained to simulate transition processes among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a suffix tree is proposed to find the longest repetitive substring in the label sequence of shot class transitions. These occurrences of this substring are regarded as a kernel of an attack hidden Markov process. Therefore, the decomposition of attack structure becomes a boundary likelihood comparison between two Markov chains. Highlights are what attract notice. Attention is a psychological measurement of “notice ”. A brief survey of attention psychological background, attention estimation from vision and auditory, and multiple modality attention fusion is presented. We propose two attention models for sports video analysis, namely, the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the perception structure during watching video. This model removes reflection bias among modality salient signals and combines these signals by reflectors. The multiresolution autoregressive framework (MAR) treats salient signals as a group of smooth random processes, which follow a similar trend but are filled with noise. This framework tries to estimate a noise-less signal from these coarse noisy observations by a multiple resolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real time event detection. The experiment shows that these attention-based approach can find goal events at a high precision. Moreover, results of MAR-based highlight detection on the final game of FIFA 2002 and 2006 are highly similar to professionally labelled highlights by BBC and FIFA
    corecore