39 research outputs found

    Computer Vision from Spatial-Multiplexing Cameras at Low Measurement Rates

    Get PDF
    abstract: In UAVs and parking lots, it is typical to first collect an enormous number of pixels using conventional imagers. This is followed by employment of expensive methods to compress by throwing away redundant data. Subsequently, the compressed data is transmitted to a ground station. The past decade has seen the emergence of novel imagers called spatial-multiplexing cameras, which offer compression at the sensing level itself by providing an arbitrary linear measurements of the scene instead of pixel-based sampling. In this dissertation, I discuss various approaches for effective information extraction from spatial-multiplexing measurements and present the trade-offs between reliability of the performance and computational/storage load of the system. In the first part, I present a reconstruction-free approach to high-level inference in computer vision, wherein I consider the specific case of activity analysis, and show that using correlation filters, one can perform effective action recognition and localization directly from a class of spatial-multiplexing cameras, called compressive cameras, even at very low measurement rates of 1\%. In the second part, I outline a deep learning based non-iterative and real-time algorithm to reconstruct images from compressively sensed (CS) measurements, which can outperform the traditional iterative CS reconstruction algorithms in terms of reconstruction quality and time complexity, especially at low measurement rates. To overcome the limitations of compressive cameras, which are operated with random measurements and not particularly tuned to any task, in the third part of the dissertation, I propose a method to design spatial-multiplexing measurements, which are tuned to facilitate the easy extraction of features that are useful in computer vision tasks like object tracking. The work presented in the dissertation provides sufficient evidence to high-level inference in computer vision at extremely low measurement rates, and hence allows us to think about the possibility of revamping the current day computer systems.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Enhancing the performance of spread spectrum techniques in different applications

    Get PDF
    Spread spectrum, Automotive Radar, Indoor Positioning Systems, Ultrasonic and Microwave Imaging, super resolution technique and wavelet transformMagdeburg, Univ., Fak. für Elektrotechnik und Informationstechnik, Diss., 2006von Omar Abdel-Gaber Mohamed Al

    Feature based dynamic intra-video indexing

    Get PDF
    A thesis submitted in partial fulfillment for the degree of Doctor of PhilosophyWith the advent of digital imagery and its wide spread application in all vistas of life, it has become an important component in the world of communication. Video content ranging from broadcast news, sports, personal videos, surveillance, movies and entertainment and similar domains is increasing exponentially in quantity and it is becoming a challenge to retrieve content of interest from the corpora. This has led to an increased interest amongst the researchers to investigate concepts of video structure analysis, feature extraction, content annotation, tagging, video indexing, querying and retrieval to fulfil the requirements. However, most of the previous work is confined within specific domain and constrained by the quality, processing and storage capabilities. This thesis presents a novel framework agglomerating the established approaches from feature extraction to browsing in one system of content based video retrieval. The proposed framework significantly fills the gap identified while satisfying the imposed constraints of processing, storage, quality and retrieval times. The output entails a framework, methodology and prototype application to allow the user to efficiently and effectively retrieved content of interest such as age, gender and activity by specifying the relevant query. Experiments have shown plausible results with an average precision and recall of 0.91 and 0.92 respectively for face detection using Haar wavelets based approach. Precision of age ranges from 0.82 to 0.91 and recall from 0.78 to 0.84. The recognition of gender gives better precision with males (0.89) compared to females while recall gives a higher value with females (0.92). Activity of the subject has been detected using Hough transform and classified using Hiddell Markov Model. A comprehensive dataset to support similar studies has also been developed as part of the research process. A Graphical User Interface (GUI) providing a friendly and intuitive interface has been integrated into the developed system to facilitate the retrieval process. The comparison results of the intraclass correlation coefficient (ICC) shows that the performance of the system closely resembles with that of the human annotator. The performance has been optimised for time and error rate

    Acoustic classification of Australian frogs for ecosystem survey

    Get PDF
    Novel bioacoustics signal processing techniques have been developed to classify frog vocalisations in both trophy and field recordings. The research is useful in helping ecologists monitor frog community activity and species richness over long-term. Two major contributions are the construction of novel feature descriptors in the Cepstral domain, and the design of novel classification systems for multiple simultaneously vocalising frog species

    Condition monitoring of belt based motion transmission systems

    Get PDF
    A key asset of Royal Mail Group consists of a nationwide network of sorting offices that forms a constituent component of the means through which the organisation provides an efficient nationwide postal service within the United Kingdom. It may be argued that the efficiency currently possessed by modem sorting offices is due to the utilisation of machines that automate the process of sorting items of mail. The modem letter-sorting machine possessed by Royal Mail can sort up to 30,000 letters per hour; such machines serve as an example of an achievement of the application of Mechatronics. The maintenance of letter sorting machines constitutes a large overhead for the organisation. In the face of competition from pervasive electronic media within the personal communications market and the prospect of deregulation, Royal Mail seeks to streamline its operation in part by the reduction of the overheads incurred through maintenance of letter sorting machinery. The adoption of condition based maintenance techniques and predictive maintenance, for letter sorting machine components such as belts and bearings, forms part of the strategy through which Royal Mail seeks to reduce this overhead. Utilisation of flat belts and timing belts for the implementation of key functions in letter sorting machinery, such as the transportation of items of mail within the mail sorting process, results in the use of many such components within letter sorting machinery. A direct link exists between the maintenance of peak performance of a sorting machine and the maintenance of belt drives; as such the maintenance of belt drives forms a substantial component of the maintenance overhead. The focus of this thesis consists of the condition monitoring of belt based motion transmission systems and in particular, flat belts. The research that forms the basis of this thesis consists of three elements. Firstly, consideration of current knowledge of belt based power transmission such as knowledge of the mechanics of the belt based power transmission process within the context of condition monitoring... [cont'd

    Sparse and low rank approximations for action recognition

    Get PDF
    Action recognition is crucial area of research in computer vision with wide range of applications in surveillance, patient-monitoring systems, video indexing, Human- Computer Interaction and many more. These applications require automated action recognition. Robust classification methods are sought-after despite influential research in this field over past decade. The data resources have grown tremendously owing to the advances in the digital revolution which cannot be compared to the meagre resources in the past. The main limitation on a system when dealing with video data is the computational burden due to large dimensions and data redundancy. Sparse and low rank approximation methods have evolved recently which aim at concise and meaningful representation of data. This thesis explores the application of sparse and low rank approximation methods in the context of video data classification with the following contributions. 1. An approach for solving the problem of action and gesture classification is proposed within the sparse representation domain, effectively dealing with large feature dimensions, 2. Low rank matrix completion approach is proposed to jointly classify more than one action 3. Deep features are proposed for robust classification of multiple actions within matrix completion framework which can handle data deficiencies. This thesis starts with the applicability of sparse representations based classifi- cation methods to the problem of action and gesture recognition. Random projection is used to reduce the dimensionality of the features. These are referred to as compressed features in this thesis. The dictionary formed with compressed features has proved to be efficient for the classification task achieving comparable results to the state of the art. Next, this thesis addresses the more promising problem of simultaneous classifi- cation of multiple actions. This is treated as matrix completion problem under transduction setting. Matrix completion methods are considered as the generic extension to the sparse representation methods from compressed sensing point of view. The features and corresponding labels of the training and test data are concatenated and placed as columns of a matrix. The unknown test labels would be the missing entries in that matrix. This is solved using rank minimization techniques based on the assumption that the underlying complete matrix would be a low rank one. This approach has achieved results better than the state of the art on datasets with varying complexities. This thesis then extends the matrix completion framework for joint classification of actions to handle the missing features besides missing test labels. In this context, deep features from a convolutional neural network are proposed. A convolutional neural network is trained on the training data and features are extracted from train and test data from the trained network. The performance of the deep features has proved to be promising when compared to the state of the art hand-crafted features

    Automatic Frog Calls Monitoring a Machine Learning Approach

    Get PDF
    Automatic recognition of frog vocalization is considered a valuable tool for a variety of biological research and environmental monitoring applications. This thesis proposes to develop an automatic, unattended monitoring system which can recognize the vocalizations of four species of frogs in the State of Oklahoma and can identify different individuals within the species of interest. The proposed monitoring system deployed one directional microphone to record the frog calls in the field continuously. Sound signals were stored in digital audiotape first and then transmitted into a PC with WAVE file fonnat. Species identification was perfonned first with the proposed filtering and grouping algorithm. Individual identification, which can detect different individual frogs within the same species, was perfonned in the second stage. Digital signal pre-processing, feature extraction, feature vector dimension reduction and pattern classification were perfonned step by step in this stage. Different feature extraction algorithms, induding the time domain method (Linear Predictive Coding), the frequency domain method (Time-Dependent Fourier transform), and time-scale domain method (Wavelet Packet Transfonn), and two different dimension reduction algorithms are synergistically integrated to produce final feature vectors which were to be fed into a neural network classifier. The simulation results show the promising future of deploying an array of continuous, on-line environmental monitoring systems based upon nonintrusive analysis of animal calls

    Human skill capturing and modelling using wearable devices

    Get PDF
    Industrial robots are delivering more and more manipulation services in manufacturing. However, when the task is complex, it is difficult to programme a robot to fulfil all the requirements because even a relatively simple task such as a peg-in-hole insertion contains many uncertainties, e.g. clearance, initial grasping position and insertion path. Humans, on the other hand, can deal with these variations using their vision and haptic feedback. Although humans can adapt to uncertainties easily, most of the time, the skilled based performances that relate to their tacit knowledge cannot be easily articulated. Even though the automation solution may not fully imitate human motion since some of them are not necessary, it would be useful if the skill based performance from a human could be firstly interpreted and modelled, which will then allow it to be transferred to the robot. This thesis aims to reduce robot programming efforts significantly by developing a methodology to capture, model and transfer the manual manufacturing skills from a human demonstrator to the robot. Recently, Learning from Demonstration (LfD) is gaining interest as a framework to transfer skills from human teacher to robot using probability encoding approaches to model observations and state transition uncertainties. In close or actual contact manipulation tasks, it is difficult to reliabley record the state-action examples without interfering with the human senses and activities. Therefore, wearable sensors are investigated as a promising device to record the state-action examples without restricting the human experts during the skilled execution of their tasks. Firstly to track human motions accurately and reliably in a defined 3-dimensional workspace, a hybrid system of Vicon and IMUs is proposed to compensate for the known limitations of the individual system. The data fusion method was able to overcome occlusion and frame flipping problems in the two camera Vicon setup and the drifting problem associated with the IMUs. The results indicated that occlusion and frame flipping problems associated with Vicon can be mitigated by using the IMU measurements. Furthermore, the proposed method improves the Mean Square Error (MSE) tracking accuracy range from 0.8˚ to 6.4˚ compared with the IMU only method. Secondly, to record haptic feedback from a teacher without physically obstructing their interactions with the workpiece, wearable surface electromyography (sEMG) armbands were used as an indirect method to indicate contact feedback during manual manipulations. A muscle-force model using a Time Delayed Neural Network (TDNN) was built to map the sEMG signals to the known contact force. The results indicated that the model was capable of estimating the force from the sEMG armbands in the applications of interest, namely in peg-in-hole and beater winding tasks, with MSE of 2.75N and 0.18N respectively. Finally, given the force estimation and the motion trajectories, a Hidden Markov Model (HMM) based approach was utilised as a state recognition method to encode and generalise the spatial and temporal information of the skilled executions. This method would allow a more representative control policy to be derived. A modified Gaussian Mixture Regression (GMR) method was then applied to enable motions reproduction by using the learned state-action policy. To simplify the validation procedure, instead of using the robot, additional demonstrations from the teacher were used to verify the reproduction performance of the policy, by assuming human teacher and robot learner are physical identical systems. The results confirmed the generalisation capability of the HMM model across a number of demonstrations from different subjects; and the reproduced motions from GMR were acceptable in these additional tests. The proposed methodology provides a framework for producing a state-action model from skilled demonstrations that can be translated into robot kinematics and joint states for the robot to execute. The implication to industry is reduced efforts and time in programming the robots for applications where human skilled performances are required to cope robustly with various uncertainties during tasks execution

    Audio Event Classification for Urban Soundscape Analysis

    Get PDF
    The study of urban soundscapes has gained momentum in recent years as more people become concerned with the level of noise around them and the negative impact this can have on comfort. Monitoring the sounds present in a sonic environment can be a laborious and time–consuming process if performed manually. Therefore, techniques for automated signal identification are gaining importance if soundscapes are to be objectively monitored. This thesis presents a novel approach to feature extraction for the purpose of classifying urban audio events, adding to the library of techniques already established in the field. The research explores how techniques with their origins in the encoding of speech signals can be adapted to represent the complex everyday sounds all around us to allow accurate classification. The analysis methods developed herein are based on the zero–crossings information contained within a signal. Originally developed for the classification of bioacoustic signals, the codebook of Time–Domain Signal Coding (TDSC) has its band–limited restrictions removed to become more generic. Classification using features extracted with the new codebook achieves accuracies of over 80% when combined with a Multilayer Perceptron classifier. Further advancements are made to the standard TDSC algorithm, drawing inspiration from wavelets, resulting in a novel dyadic representation of time–domain features. Carrying the label of Multiscale TDSC (MTDSC), classification accuracies of 70% are achieved using these features. Recommendations for further work focus on expanding the library of training data to improve the accuracy of the classification system. Further research into classifier design is also suggested
    corecore