An Overview of Multimodal Techniques for the Characterization of Sport Programmes
The problem of content characterization of sports videos is of great interest because sports video appeals to large audiences and its efficient distribution over various networks should contribute to the widespread usage of multimedia services. In this paper we analyze several techniques proposed in the literature for content characterization of sports videos. We focus this analysis on the type of signal (audio, video, text captions, ...) from which the low-level features are extracted. First we consider techniques based on visual information, then methods based on audio information, and finally algorithms that use audio-visual cues in a multi-modal fashion. This analysis shows that each type of signal carries some peculiar information, and that a multi-modal approach can fully exploit the multimedia information associated with a sports video. Moreover, we observe that characterization is performed either by considering what happens in a specific time segment, thus observing the features in a "static" way, or by trying to capture their "dynamic" evolution in time. The effectiveness of each approach depends mainly on the kind of sport it relates to and the type of highlights being targeted.
Semantic Indexing of Sport Program Sequences by Audio-Visual Analysis
Semantic indexing of sports videos is a subject of great interest to researchers working on multimedia content characterization. Sports programs appeal to large audiences, and their efficient distribution over various networks should contribute to the widespread usage of multimedia services. In this paper, we propose a semantic indexing algorithm for soccer programs which uses both audio and visual information for content characterization. The video signal is processed first by extracting low-level visual descriptors from the MPEG compressed bit-stream. The temporal evolution of these descriptors during a semantic event is assumed to be governed by a controlled Markov chain. This makes it possible to determine, based on the maximum likelihood criterion, a list of video segments where a semantic event of interest is likely to be found. The audio information is then used to refine the results of the video classification procedure by ranking the candidate video segments so that the segments associated with the event of interest appear in the very first positions of the ordered list. The proposed method is applied to goal detection. Experimental results show the effectiveness of the proposed cross-modal approach.
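As a loose illustration of the maximum-likelihood step described above, the sketch below scores quantised shot-descriptor sequences under a hypothetical "goal event" Markov chain and ranks candidate segments by likelihood. The state definitions and transition probabilities are invented for the example, not taken from the paper.

```python
import numpy as np

# Illustrative transition matrix for a "goal" event over three assumed
# visual states: 0 = wide shot, 1 = close-up, 2 = crowd shot.
GOAL_TRANSITIONS = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.3, 0.5],
])

def log_likelihood(state_seq, P):
    """Log-probability of a quantised descriptor sequence under P."""
    return sum(np.log(P[a, b]) for a, b in zip(state_seq, state_seq[1:]))

def rank_segments(segments, P=GOAL_TRANSITIONS):
    """Order candidate segments from most to least event-like."""
    return sorted(segments, key=lambda s: log_likelihood(s, P), reverse=True)

candidates = [[0, 0, 1, 2, 2], [0, 0, 0, 0, 0], [1, 2, 2, 1, 2]]
ranking = rank_segments(candidates)
```

In the paper the audio channel then re-ranks this list; here the ordering comes from the visual likelihood alone.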
Video Scene Detection Using Closed Caption Text
Issues in automatic video biography editing are similar to those in video scene detection and Topic Detection and Tracking (TDT). The techniques of video scene detection and TDT can be applied to interviews to reduce the time needed to edit a video biography. This thesis project attacked the problems of video text extraction, story segmentation, and correlation, and was accordingly divided into three parts: extraction, scene detection, and correlation. The project successfully detected scene breaks in episodes of television series and displayed scenes with similar content.
User-centred video abstraction
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. The rapid growth of digital video content in recent years has imposed the need for technologies capable of producing condensed but semantically rich versions of an input video stream in an effective manner. Consequently, the topic of video summarisation is becoming increasingly popular in the multimedia community, and numerous video abstraction approaches have been proposed. These techniques can be divided into two major categories, automatic and semi-automatic, according to the level of human intervention required in the summarisation process. The fully automated methods mainly adopt low-level visual, aural and textual features alongside mathematical and statistical algorithms to extract the most significant segments of the original video. However, the effectiveness of this type of technique is restricted by a number of factors, such as domain dependency, computational expense and the inability to understand the semantics of videos from low-level features. The second category of techniques attempts to improve the quality of summaries by involving humans in the abstraction process to bridge the semantic gap. Nonetheless, a single user’s subjectivity and other external contributing factors, such as distraction, can potentially deteriorate the performance of this group of approaches. Accordingly, in this thesis we have focused on the development of three user-centred video summarisation techniques that can be applied to different video categories and generate satisfactory results. In our first proposed approach, a novel mechanism for user-centred video summarisation is presented for scenarios in which multiple actors take part in the summarisation process, in order to minimise the negative effects of relying on a single user.
In this algorithm, the video frames were initially scored by a group of video annotators ‘on the fly’. These scores were then averaged to produce a single saliency score for each video frame and, finally, the highest-scored video frames, alongside the corresponding audio and textual contents, were extracted for inclusion in the final summary. The effectiveness of our approach has been assessed by comparing the video summaries it generates against the results obtained from three existing automatic summarisation tools that adopt different modalities for abstraction. The experimental results indicated that our proposed method delivers strong outcomes in terms of Overall Satisfaction and Precision with an acceptable Recall rate, indicating the usefulness of involving user input in the video summarisation process. In an attempt to provide a better user experience, we then proposed a personalised video summarisation method with the ability to customise the generated summaries according to the viewers’ preferences. Accordingly, the end-user’s priority levels towards different video scenes were captured and used to update the average scores previously assigned by the video annotators. Our earlier summarisation method was then adopted to extract the most significant audio-visual content of the video. Experimental results indicated that this approach delivers superior outcomes compared with our previously proposed method and the three other automatic summarisation tools. Finally, we attempted to reduce the level of audience involvement required for personalisation by proposing a new method for producing personalised video summaries. Accordingly, SIFT visual features were adopted to identify the semantic categories of video scenes. Fusing this retrieved data with pre-built user profiles, personalised video abstracts can be created. Experimental results showed the effectiveness of this method in delivering superior outcomes compared with our previously proposed algorithm and the three other automatic summarisation techniques.
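The first stage described above, averaging annotator scores into a per-frame saliency value and keeping the top-scoring frames, can be sketched as follows. The function names and the summary length are assumptions for the example, not the thesis implementation.

```python
import numpy as np

def saliency_scores(annotator_scores):
    """Average per-frame ratings across annotators.

    annotator_scores: (num_annotators, num_frames) array of ratings.
    """
    return np.asarray(annotator_scores, dtype=float).mean(axis=0)

def select_summary_frames(annotator_scores, summary_len):
    """Indices of the highest-saliency frames, returned in temporal order."""
    saliency = saliency_scores(annotator_scores)
    top = np.argsort(saliency)[-summary_len:]
    return sorted(top.tolist())

scores = [[1, 4, 2, 5, 1],   # annotator A
          [2, 5, 1, 4, 2]]   # annotator B
frames = select_summary_frames(scores, summary_len=2)  # frames 1 and 3
```

The personalised variant described above would additionally re-weight the averaged scores by the viewer's scene-level preferences before selection.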
Acceleration profiles of adolescent soccer players across a season
The injury risk inherent to soccer can be affected by external training loads and intrinsic factors. These intrinsic factors (sex, mass, strength, coordination, etc.) in young athletes can change rapidly near their peak height velocity (PHV) during puberty, modifying their movement complexity and, potentially, their injury risk. While quantification of movement complexity through multiscale entropy analysis has been used in past biomechanical investigations, no studies have applied this analysis to tibial accelerometry signals collected from these maturing athletes. The purpose of this study was to collect tibial acceleration data from youth soccer athletes during several discrete drills and to determine whether discrete acceleration metrics or signal complexity differ across athletes based on their relation to PHV, sex, or over the course of a season. Limited significant fixed effects of time on tibial movement complexity were found during only two drills in our protocol, while several drills showed significant effects of PHV, sex, and time on acceleration peaks and integrals. However, for both the complexity and the discrete acceleration statistical analyses, subsequent model performance and comparison to the null model suggest that the predictive power of our independent variables is limited in these contexts. The findings of this study lay the groundwork for future research examining tibial acceleration signals as they relate to external loading complexity and magnitude within the lower extremities.
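Multiscale entropy, as used in this line of work, coarse-grains a signal at several time scales and computes sample entropy at each scale. The sketch below is a minimal version under common default parameters (m = 2, tolerance r = 0.2 × SD); these are standard choices in the literature, not necessarily the ones used in this study.

```python
import numpy as np

def coarse_grain(x, tau):
    """Average non-overlapping windows of length tau."""
    n = len(x) // tau
    return np.asarray(x[:n * tau]).reshape(n, tau).mean(axis=1)

def sample_entropy(x, m=2, r=None):
    """Sample entropy with Chebyshev distance and tolerance r."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()
    def count_matches(mm):
        # All templates of length mm and their pairwise Chebyshev distances.
        templates = np.array([x[i:i + mm] for i in range(len(x) - mm + 1)])
        d = np.max(np.abs(templates[:, None] - templates[None, :]), axis=2)
        # Unordered pairs i != j within tolerance.
        return (np.sum(d <= r) - len(templates)) / 2
    b, a = count_matches(m), count_matches(m + 1)
    return np.inf if a == 0 else -np.log(a / b)

def multiscale_entropy(x, scales=(1, 2, 3)):
    return [sample_entropy(coarse_grain(x, tau)) for tau in scales]
```

A more regular signal yields lower entropy across scales; the study compares such curves across drills, PHV groups, and time points.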
Audio-visual football video analysis, from structure detection to attention analysis
Sport video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academic fields. A sports video is characterised by repetitive temporal structures, relatively plain contents, and strong spatio-temporal variations, such as quick camera switches and swift local motions. It is necessary to develop specific techniques for content-based sports video analysis to utilise these characteristics.
For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are the key stories in sports videos; (2) what arouses viewers’ interest; and (3) how can game highlights be identified? This thesis is developed around these questions. We approach them from two different perspectives, and in turn three research contributions are presented: replay detection, attack temporal structure decomposition, and attention-based highlight identification.
Replay segments convey the most important content in sports videos, so detecting them is an efficient way to collect game highlights. However, replay is an artefact of editing, and it evolves with advances in video editing tools. The composition of a replay is complex, including logo transitions, slow motion, viewpoint switches and normal-speed video clips. Since logo transition clips are pervasive in game collections from the FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective substitute for replay detection. A two-pass system was developed, comprising a five-layer AdaBoost classifier and logo template matching across the entire video. The five-layer AdaBoost classifier uses shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions to filter logo transition candidates. Subsequently, a logo template is constructed and employed to find all logo transition sequences. The precision and recall of this system in replay detection are 100% on a five-game evaluation collection.
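The second pass described above matches a logo template against candidate frames. A minimal sketch of that matching step, using normalised correlation on toy arrays (a real system would operate on decoded video frames, and the threshold is an assumption):

```python
import numpy as np

def normalised_correlation(frame, template):
    """Zero-mean normalised correlation between two equal-sized arrays."""
    f = frame.astype(float) - frame.mean()
    t = template.astype(float) - template.mean()
    denom = np.linalg.norm(f) * np.linalg.norm(t)
    return 0.0 if denom == 0 else float((f * t).sum() / denom)

def find_logo_frames(frames, template, threshold=0.9):
    """Indices of frames whose correlation with the logo template
    exceeds the (assumed) threshold."""
    return [i for i, fr in enumerate(frames)
            if normalised_correlation(fr, template) >= threshold]

template = np.array([[0, 1], [1, 0]])
frames = [np.array([[0, 1], [1, 0]]),   # logo frame
          np.array([[1, 1], [1, 1]]),   # plain frame
          np.array([[0, 2], [2, 0]])]   # brighter logo frame
hits = find_logo_frames(frames, template)  # → [0, 2]
```

Normalised correlation is invariant to brightness scaling and offset, which is why the brighter logo frame still matches.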
An attack structure is a team competition for a score; hence, this structure is a conceptually fundamental unit of a football video, as well as of other sports videos. We review the literature on content-based temporal structures, such as the play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely play, focus, replay and break, were identified from low-level visual features. A four-state hidden Markov model was trained to model the transition process among these shot classes. Since attack structures are the longest repetitive temporal units in a sports video, a suffix tree is proposed to find the longest repeated substring in the label sequence of shot class transitions. The occurrences of this substring are regarded as kernels of an attack hidden Markov process. The decomposition of the attack structure therefore becomes a boundary likelihood comparison between two Markov chains.
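The repeated-pattern step above can be illustrated on a string of shot-class labels (here p = play, f = focus, r = replay, b = break, an assumed encoding). The thesis uses a suffix tree; this binary-search sketch finds the same longest repeated substring and is adequate for short label sequences.

```python
def longest_repeated_substring(labels):
    """Longest substring of `labels` that occurs at least twice."""
    def repeated_of_length(k):
        seen = set()
        for i in range(len(labels) - k + 1):
            sub = labels[i:i + k]
            if sub in seen:
                return sub
            seen.add(sub)
        return None

    best = ""
    lo, hi = 1, len(labels) - 1
    while lo <= hi:            # binary search on pattern length:
        mid = (lo + hi) // 2   # if a length-k repeat exists, so does k-1
        hit = repeated_of_length(mid)
        if hit is not None:
            best, lo = hit, mid + 1
        else:
            hi = mid - 1
    return best

kernel = longest_repeated_substring("pfrbbpfrbp")  # → "pfrb"
```

The occurrences of the returned substring would then seed the attack-structure hidden Markov process described above.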
Highlights are what attract notice, and attention is a psychological measurement of “notice”. A brief survey of the psychological background of attention, attention estimation from visual and auditory cues, and multi-modal attention fusion is presented. We propose two attention models for sports video analysis: the role-based attention model and the multiresolution autoregressive (MAR) framework. The role-based attention model is based on the structure of perception while watching video; it removes reflection bias among modality salient signals and combines these signals through reflectors. The MAR framework treats salient signals as a group of smooth random processes that follow a similar trend but are corrupted by noise, and it estimates a noise-free signal from these coarse, noisy observations by multiple-resolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real-time event detection. Experiments show that these attention-based approaches can find goal events with high precision. Moreover, the results of MAR-based highlight detection on the final games of the FIFA World Cup 2002 and 2006 are highly similar to the highlights professionally labelled by the BBC and FIFA.
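As a loose illustration of the idea behind multiresolution estimation of a noisy saliency curve, the sketch below smooths the curve at several resolutions, averages the smoothed estimates, and flags frames above a threshold as highlight candidates. This is not the MAR framework itself, which estimates the signal on a multiresolution tree; the window sizes and threshold here are assumptions.

```python
import numpy as np

def multiresolution_estimate(saliency, windows=(2, 4, 8)):
    """Average of moving-average smoothings at several window sizes."""
    s = np.asarray(saliency, dtype=float)
    est = np.zeros_like(s)
    for w in windows:
        kernel = np.ones(w) / w
        est += np.convolve(s, kernel, mode="same")
    return est / len(windows)

def highlight_frames(saliency, threshold=0.6):
    """Indices whose multiresolution estimate exceeds the threshold."""
    est = multiresolution_estimate(saliency)
    return np.flatnonzero(est >= threshold).tolist()

# A noisy-free toy saliency curve with one sustained burst of attention.
sal = [0.0] * 20 + [1.0] * 8 + [0.0] * 20
hits = highlight_frames(sal)
```

Averaging across resolutions suppresses brief spikes (visible only at the finest scale) while preserving sustained bursts, which is the intuition behind treating the salient signals as smooth processes plus noise.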
Analysis of the backpack loading effects on the human gait
Gait is a simple activity of daily life and one of the main abilities of the human being. Often, during leisure, labour and sports activities, loads (e.g. a backpack) are carried during gait. These circumstantial loads can generate instability and increase biomechanical stress on human tissues and systems, especially the locomotor, balance and postural regulation systems. According to Wearing (2006), subjects who carry a transitory or intermittent load will be able to find relatively efficient solutions to compensate for its effects.
ENHANCING THE OPERATIONAL RESILIENCE OF CYBER-MANUFACTURING SYSTEMS (CMS) AGAINST CYBER-ATTACKS
Cyber-manufacturing systems (CMS) are interconnected production environments composed of complex, networked cyber-physical systems (CPS) that can be instantiated across one or many locations. However, this vision of manufacturing environments ushers in the challenge of addressing new security threats to production systems that still contain traditional closed legacy elements. The widespread adoption of CMS has come with a dramatic increase in successful cyber-attacks. With a myriad of new targets and vulnerabilities, hackers have been able to cause significant economic losses by disrupting manufacturing operations, reducing outgoing product quality, and altering product designs. This research aims to contribute to the design of more resilient cyber-manufacturing systems. Traditional cybersecurity mechanisms focus on preventing the occurrence of cyber-attacks, improving the accuracy of detection, and increasing the speed of recovery. What is more often neglected is how to respond to a successful attack during the time from the attack onset until system recovery. We propose a novel approach that correlates the state of production and the timing of the attack to predict the effect on key manufacturing performance indicators. A real-time decision strategy is then deployed to select the appropriate response, so as to maintain availability, utilization efficiency, and quality ratio above degradation thresholds until recovery. Our goal is to demonstrate that a CMS can be made to withstand the advent of cyber-attacks while remaining operationally resilient. This research presents a novel framework to enhance the operational resilience of cyber-manufacturing systems against cyber-attacks.
In contrast to other CPS, where the general goal of operational resilience is to maintain a certain target level of availability, we propose a manufacturing-centric approach that uses production key performance indicators as targets. In this way, the decision-making process for security is aligned with the operational strategy and bound by the socio-economic constraints inherent to manufacturing. Our proposed framework consists of four steps: 1) Identify: map CMS production goals, vulnerabilities, and resilience-enhancing mechanisms; 2) Establish: set performance targets for production output, scrap rate, and downtime in different states; 3) Select: determine which mechanisms are needed and their triggering strategy; and 4) Deploy: integrate the selected mechanisms, threat severity evaluation, and activation strategy into the operation of the CMS. Lastly, we demonstrate through experimentation on a CMS testbed that this framework can effectively enhance the operational resilience of a CMS against a known cyber-attack.
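The response-selection idea described above, picking the least disruptive response whose predicted KPI levels all stay above their degradation thresholds, can be sketched as follows. The response names, KPI names, and all numbers are illustrative, not values from this research.

```python
# Degradation thresholds per KPI (hypothetical).
THRESHOLDS = {"availability": 0.85, "utilisation": 0.70, "quality_ratio": 0.95}

# Predicted KPI levels under each candidate response, ordered from least
# to most disruptive (hypothetical numbers).
RESPONSES = [
    ("continue",        {"availability": 0.95, "utilisation": 0.90, "quality_ratio": 0.80}),
    ("isolate_machine", {"availability": 0.88, "utilisation": 0.75, "quality_ratio": 0.97}),
    ("halt_line",       {"availability": 0.40, "utilisation": 0.10, "quality_ratio": 1.00}),
]

def select_response(responses=RESPONSES, thresholds=THRESHOLDS):
    """First (least disruptive) response keeping every KPI above threshold."""
    for name, predicted in responses:
        if all(predicted[k] >= thresholds[k] for k in thresholds):
            return name
    return responses[-1][0]   # fall back to the most conservative response

choice = select_response()  # "isolate_machine": "continue" fails quality_ratio
```

In the framework above, the predicted KPI levels would come from correlating the production state with the attack timing (step 2), and the triggering strategy (step 3) determines when this selection runs.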