199 research outputs found

    Feature binding of MPEG-7 Visual Descriptors Using Chaotic Series

    Get PDF
    Due to advanced segmentation and tracking algorithms, a video can be divided into numerous objects. Segmentation and tracking algorithms output different low-level object features, resulting in a high-dimensional feature vector per object. The challenge is to generate feature vector of objects which can be mapped to human understandable description, such as object labels, e.g., person, car. MPEG-7 provides visual descriptors to describe video contents. However, generally the MPEG-7 visual descriptors are highly redundant, and the feature coefficients in these descriptors need to be pre-processed for domain specific application. Ideal case would be if MPEG-7 visual descriptor based feature vector, can be processed similar to some functional simulations of human brain activity. There has been a established link between the analysis of temporal human brain oscillatory signals and chaotic dynamics from the electroencephalography (EEG) of the brain neurons. Neural signals in limited brain activities are found to be behaviorally relevant (previously appeared to be noise) and can be simulated using chaotic series. Chaotic series is referred to as either a finite-difference or an ordinary differential equation, which presents non-random, irregular fluctuations of parameter values over time in a dynamical system. The dynamics in a chaotic series can be high - or low -dimensional, and the dimensionality can be deduced from the topological dimension of the attractor of the chaotic series. An attractor is manifested by the tendency of a non-linear finite difference equation or an ordinary differential equation, under various but delimited conditions, to go to a reproducible active state, and stay there. We propose a feature binding method, using chaotic series, to generate a new feature vector, C-MP7 , to describe video objects. The proposed method considers MPEG-7 visual descriptor coefficients as dynamical systems. Dynamical systems are excited (similar to neuronal excitation) with either high- or low-dimensional chaotic series, and then histogram-based clustering is applied on the simulated chaotic series coefficients to generate C-MP7 . The proposed feature binding offers better feature vector with high-dimensional chaotic series simulation than with low-dimensional chaotic series, over MPEG-7 visual descriptor based feature vector. Diverse video objects are grouped in four generic classes (e.g., has [barbelow]person, has [barbelow]group [barbelow]of [barbelow]persons, has [barbelow]vehicle, and has [barbelow]unknown ) to observe how well C-MP7 describes different video objects compared to MPEG-7 feature vector. In C-MP7 , with high dimensional chaotic series simulation, 1). descriptor coefficients are reduced dynamically up to 37.05% compared to 10% in MPEG-7 , 2) higher variance is achieved than MPEG-7 , 3) multi-class discriminant analysis of C-MP7 with Fisher-criteria shows increased binary class separation for clustered video objects than that of MPEG-7 , and 4) C-MP7 , specifically provides good clustering of video objects for has [barbelow]vehicle class against other classes. To test C-MP7 in an application, we deploy a combination of multiple binary classifiers for video object classification. Related work on video object classification use non-MPEG-7 features. We specifically observe classification of challenging surveillance video objects, e.g., incomplete objects, partial occlusion, background over lapping, scale and resolution variant objects, indoor / outdoor lighting variations. C-MP7 is used to train different classes of video objects. Object classification accuracy is verified with both low-dimensional and high-dimensional chaotic series based feature binding for C-MP7 . Testing of diverse video objects with high-dimensional chaotic series simulation shows, 1) classification accuracy significantly improves on average, 83% compared to the 62% with MPEG-7 , 2) excellent clustering of vehicle objects leads to above 99% accuracy for only vehicles against all other objects, and 3) with diverse video objects, including objects from poor segmentation. C-MP7 is more robust as a feature vector in classification than MPEG-7 . Initial results on sub-group classification for male and female video objects in has [barbelow]person class are also presentated as subjective observations. Earlier, chaos series properties have been used in video processing applications for compression and digital watermarking. To our best knowledge, this work is the first to use chaotic series for video object description and apply it for object classificatio

    Multimedia

    Get PDF
    The nowadays ubiquitous and effortless digital data capture and processing capabilities offered by the majority of devices, lead to an unprecedented penetration of multimedia content in our everyday life. To make the most of this phenomenon, the rapidly increasing volume and usage of digitised content requires constant re-evaluation and adaptation of multimedia methodologies, in order to meet the relentless change of requirements from both the user and system perspectives. Advances in Multimedia provides readers with an overview of the ever-growing field of multimedia by bringing together various research studies and surveys from different subfields that point out such important aspects. Some of the main topics that this book deals with include: multimedia management in peer-to-peer structures & wireless networks, security characteristics in multimedia, semantic gap bridging for multimedia content and novel multimedia applications

    Shape Representation in Primate Visual Area 4 and Inferotemporal Cortex

    Get PDF
    The representation of contour shape is an essential component of object recognition, but the cortical mechanisms underlying it are incompletely understood, leaving it a fundamental open question in neuroscience. Such an understanding would be useful theoretically as well as in developing computer vision and Brain-Computer Interface applications. We ask two fundamental questions: “How is contour shape represented in cortex and how can neural models and computer vision algorithms more closely approximate this?” We begin by analyzing the statistics of contour curvature variation and develop a measure of salience based upon the arc length over which it remains within a constrained range. We create a population of V4-like cells – responsive to a particular local contour conformation located at a specific position on an object’s boundary – and demonstrate high recognition accuracies classifying handwritten digits in the MNIST database and objects in the MPEG-7 Shape Silhouette database. We compare the performance of the cells to the “shape-context” representation (Belongie et al., 2002) and achieve roughly comparable recognition accuracies using a small test set. We analyze the relative contributions of various feature sensitivities to recognition accuracy and robustness to noise. Local curvature appears to be the most informative for shape recognition. We create a population of IT-like cells, which integrate specific information about the 2-D boundary shapes of multiple contour fragments, and evaluate its performance on a set of real images as a function of the V4 cell inputs. We determine the sub-population of cells that are most effective at identifying a particular category. We classify based upon cell population response and obtain very good results. We use the Morris-Lecar neuronal model to more realistically illustrate the previously explored shape representation pathway in V4 – IT. We demonstrate recognition using spatiotemporal patterns within a winnerless competition network with FitzHugh-Nagumo model neurons. Finally, we use the Izhikevich neuronal model to produce an enhanced response in IT, correlated with recognition, via gamma synchronization in V4. Our results support the hypothesis that the response properties of V4 and IT cells, as well as our computer models of them, function as robust shape descriptors in the object recognition process

    Voice Modeling Methods for Automatic Speaker Recognition

    Get PDF
    Building a voice model means to capture the characteristics of a speaker´s voice in a data structure. This data structure is then used by a computer for further processing, such as comparison with other voices. Voice modeling is a vital step in the process of automatic speaker recognition that itself is the foundation of several applied technologies: (a) biometric authentication, (b) speech recognition and (c) multimedia indexing. Several challenges arise in the context of automatic speaker recognition. First, there is the problem of data shortage, i.e., the unavailability of sufficiently long utterances for speaker recognition. It stems from the fact that the speech signal conveys different aspects of the sound in a single, one-dimensional time series: linguistic (what is said?), prosodic (how is it said?), individual (who said it?), locational (where is the speaker?) and emotional features of the speech sound itself (to name a few) are contained in the speech signal, as well as acoustic background information. To analyze a specific aspect of the sound regardless of the other aspects, analysis methods have to be applied to a specific time scale (length) of the signal in which this aspect stands out of the rest. For example, linguistic information (i.e., which phone or syllable has been uttered?) is found in very short time spans of only milliseconds of length. On the contrary, speakerspecific information emerges the better the longer the analyzed sound is. Long utterances, however, are not always available for analysis. Second, the speech signal is easily corrupted by background sound sources (noise, such as music or sound effects). Their characteristics tend to dominate a voice model, if present, such that model comparison might then be mainly due to background features instead of speaker characteristics. Current automatic speaker recognition works well under relatively constrained circumstances, such as studio recordings, or when prior knowledge on the number and identity of occurring speakers is available. Under more adverse conditions, such as in feature films or amateur material on the web, the achieved speaker recognition scores drop below a rate that is acceptable for an end user or for further processing. For example, the typical speaker turn duration of only one second and the sound effect background in cinematic movies render most current automatic analysis techniques useless. In this thesis, methods for voice modeling that are robust with respect to short utterances and background noise are presented. The aim is to facilitate movie analysis with respect to occurring speakers. Therefore, algorithmic improvements are suggested that (a) improve the modeling of very short utterances, (b) facilitate voice model building even in the case of severe background noise and (c) allow for efficient voice model comparison to support the indexing of large multimedia archives. The proposed methods improve the state of the art in terms of recognition rate and computational efficiency. Going beyond selective algorithmic improvements, subsequent chapters also investigate the question of what is lacking in principle in current voice modeling methods. By reporting on a study with human probands, it is shown that the exclusion of time coherence information from a voice model induces an artificial upper bound on the recognition accuracy of automatic analysis methods. A proof-of-concept implementation confirms the usefulness of exploiting this kind of information by halving the error rate. This result questions the general speaker modeling paradigm of the last two decades and presents a promising new way. The approach taken to arrive at the previous results is based on a novel methodology of algorithm design and development called “eidetic design". It uses a human-in-the-loop technique that analyses existing algorithms in terms of their abstract intermediate results. The aim is to detect flaws or failures in them intuitively and to suggest solutions. The intermediate results often consist of large matrices of numbers whose meaning is not clear to a human observer. Therefore, the core of the approach is to transform them to a suitable domain of perception (such as, e.g., the auditory domain of speech sounds in case of speech feature vectors) where their content, meaning and flaws are intuitively clear to the human designer. This methodology is formalized, and the corresponding workflow is explicated by several use cases. Finally, the use of the proposed methods in video analysis and retrieval are presented. This shows the applicability of the developed methods and the companying software library sclib by means of improved results using a multimodal analysis approach. The sclib´s source code is available to the public upon request to the author. A summary of the contributions together with an outlook to short- and long-term future work concludes this thesis

    A survey of the application of soft computing to investment and financial trading

    Get PDF

    State-of-the-Art Sensors Technology in Spain 2015: Volume 1

    Get PDF
    This book provides a comprehensive overview of state-of-the-art sensors technology in specific leading areas. Industrial researchers, engineers and professionals can find information on the most advanced technologies and developments, together with data processing. Further research covers specific devices and technologies that capture and distribute data to be processed by applying dedicated techniques or procedures, which is where sensors play the most important role. The book provides insights and solutions for different problems covering a broad spectrum of possibilities, thanks to a set of applications and solutions based on sensory technologies. Topics include: • Signal analysis for spectral power • 3D precise measurements • Electromagnetic propagation • Drugs detection • e-health environments based on social sensor networks • Robots in wireless environments, navigation, teleoperation, object grasping, demining • Wireless sensor networks • Industrial IoT • Insights in smart cities • Voice recognition • FPGA interfaces • Flight mill device for measurements on insects • Optical systems: UV, LEDs, lasers, fiber optics • Machine vision • Power dissipation • Liquid level in fuel tanks • Parabolic solar tracker • Force sensors • Control for a twin roto

    AXMEDIS 2007 Conference Proceedings

    Get PDF
    The AXMEDIS International Conference series has been established since 2005 and is focused on the research, developments and applications in the cross-media domain, exploring innovative technologies to meet the challenges of the sector. AXMEDIS2007 deals with all subjects and topics related to cross-media and digital-media content production, processing, management, standards, representation, sharing, interoperability, protection and rights management. It addresses the latest developments and future trends of the technologies and their applications, their impact and exploitation within academic, business and industrial communities

    HPCCP/CAS Workshop Proceedings 1998

    Get PDF
    This publication is a collection of extended abstracts of presentations given at the HPCCP/CAS (High Performance Computing and Communications Program/Computational Aerosciences Project) Workshop held on August 24-26, 1998, at NASA Ames Research Center, Moffett Field, California. The objective of the Workshop was to bring together the aerospace high performance computing community, consisting of airframe and propulsion companies, independent software vendors, university researchers, and government scientists and engineers. The Workshop was sponsored by the HPCCP Office at NASA Ames Research Center. The Workshop consisted of over 40 presentations, including an overview of NASA's High Performance Computing and Communications Program and the Computational Aerosciences Project; ten sessions of papers representative of the high performance computing research conducted within the Program by the aerospace industry, academia, NASA, and other government laboratories; two panel sessions; and a special presentation by Mr. James Bailey

    Mobile Robots Navigation

    Get PDF
    Mobile robots navigation includes different interrelated activities: (i) perception, as obtaining and interpreting sensory information; (ii) exploration, as the strategy that guides the robot to select the next direction to go; (iii) mapping, involving the construction of a spatial representation by using the sensory information perceived; (iv) localization, as the strategy to estimate the robot position within the spatial map; (v) path planning, as the strategy to find a path towards a goal location being optimal or not; and (vi) path execution, where motor actions are determined and adapted to environmental changes. The book addresses those activities by integrating results from the research work of several authors all over the world. Research cases are documented in 32 chapters organized within 7 categories next described

    Cyber Security and Critical Infrastructures 2nd Volume

    Get PDF
    The second volume of the book contains the manuscripts that were accepted for publication in the MDPI Special Topic "Cyber Security and Critical Infrastructure" after a rigorous peer-review process. Authors from academia, government and industry contributed their innovative solutions, consistent with the interdisciplinary nature of cybersecurity. The book contains 16 articles, including an editorial that explains the current challenges, innovative solutions and real-world experiences that include critical infrastructure and 15 original papers that present state-of-the-art innovative solutions to attacks on critical systems
    • …
    corecore