7 research outputs found

    Spontaneous facial expression analysis using optical flow

    © 2017 IEEE. Investigation of emotions manifested through facial expressions has valuable applications in predictive behavioural studies. This has piqued interest in developing intelligent visual surveillance that couples facial expression analysis with Closed Circuit Television (CCTV). However, a facial recognition program tailored to evaluating facial behaviour for forensic and security purposes is feasible only if patterns of emotion can be detected reliably. The present study assesses whether emotional expression derived from frontal or profile views of the face can be used to distinguish three emotions, Amusement, Sadness and Fear, using the optical flow technique. Analysis took the form of emotion maps constructed from feature vectors obtained with the Lucas-Kanade implementation of optical flow, and these feature vectors were used as inputs for classification. It was anticipated that the findings would assist in improving the optical flow algorithm for feature extraction. However, further data analysis is necessary to confirm whether different types of emotion can be identified clearly using optical flow or similar techniques.
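
    As a rough illustration of the feature-extraction step described above (not the study's own code), the sketch below tracks a grid of points between two face frames with OpenCV's pyramidal Lucas-Kanade routine and stacks the displacements into a feature vector. The synthetic frames, grid spacing and window size are illustrative assumptions.

```python
import cv2
import numpy as np

# Synthetic pair of face frames: real use would take consecutive video frames
# of a cropped face region instead.
prev = np.random.randint(0, 255, (120, 120), dtype=np.uint8)
curr = np.roll(prev, 2, axis=1)          # simulate a small horizontal motion

# Track a regular grid of points over the (assumed already cropped) face region.
h, w = prev.shape
ys, xs = np.mgrid[10:h - 10:15, 10:w - 10:15]
pts = np.dstack([xs.ravel(), ys.ravel()]).astype(np.float32).reshape(-1, 1, 2)

# Pyramidal Lucas-Kanade optical flow between the two frames.
next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
    prev, curr, pts, None, winSize=(21, 21), maxLevel=3)

# Displacements at successfully tracked points form the per-frame feature vector;
# reshaping them back onto the grid gives an "emotion map" style visualisation.
flow = (next_pts - pts).reshape(-1, 2)
flow[status.ravel() == 0] = 0.0
feature_vector = flow.ravel()
print(feature_vector.shape)
```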

    Image tag completion by local learning

    The problem of tag completion is to learn the missing tags of an image. In this paper, we propose to learn a tag scoring vector for each image by local linear learning. A local linear function is used in the neighborhood of each image to predict the tag scoring vectors of its neighboring images. We construct a unified objective function for learning both the tag scoring vectors and the local linear function parameters. In this objective, we require the learned tag scoring vectors to be consistent with the known tag associations of each image, minimize the prediction error of each local linear function, and reduce the complexity of each local function. The objective function is optimized by an alternate optimization strategy and gradient descent methods in an iterative algorithm. We compare the proposed algorithm against several state-of-the-art tag completion methods, and the results show its advantages.
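
    The alternating scheme summarised above can be sketched as follows. This is a simplified, hedged reconstruction rather than the paper's exact objective; the dimensions, trade-off weights (alpha, beta), learning rate and k-nearest-neighbour construction are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, t, k = 50, 20, 10, 5                      # images, feature dim, tags, neighbours
X = rng.normal(size=(n, d))                     # image features
Y = (rng.random((n, t)) < 0.2).astype(float)    # known (incomplete) tag matrix
M = (rng.random((n, t)) < 0.7).astype(float)    # mask: 1 where a tag entry is observed

# k nearest neighbours by Euclidean distance in feature space
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
np.fill_diagonal(D, np.inf)
nbrs = np.argsort(D, axis=1)[:, :k]

S = Y.copy()                                    # tag scoring vectors, initialised from known tags
W = rng.normal(scale=0.01, size=(n, d, t))      # one local linear predictor per image
alpha, beta, lr = 1.0, 0.1, 0.01

for _ in range(100):
    # step 1: update each local linear predictor by gradient descent
    for i in range(n):
        Ni = nbrs[i]
        pred = X[Ni] @ W[i]                     # predictions for the neighbours of image i
        gW = X[Ni].T @ (pred - S[Ni]) + beta * W[i]   # prediction error + complexity penalty
        W[i] -= lr * gW
    # step 2: update each tag scoring vector by gradient descent
    for i in range(n):
        g = alpha * M[i] * (S[i] - Y[i])        # consistency with the known tags
        owners = np.where((nbrs == i).any(axis=1))[0]
        for j in owners:                        # agreement with local predictors involving image i
            g += S[i] - X[i] @ W[j]
        S[i] -= lr * g

completed = S   # each row scores all tags for an image, including the missing ones
print(completed.shape)
```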

    Supervised cross-modal factor analysis for multiple modal data classification

    In this paper we study the problem of learning from multi-modal data for the purpose of document classification. In this problem, each document is composed of two different modalities of data, i.e., an image and a text. Cross-modal factor analysis (CFA) has been proposed to project the two modalities into a shared data space, so that the classification of an image or a text can be performed directly in this space. A disadvantage of CFA is that it ignores the supervision information. In this paper, we improve CFA by incorporating the supervision information to represent and classify both the image and text modalities of documents. We project both image and text data into a shared data space by factor analysis, and then train a class label predictor in the shared space to use the class label information. The factor analysis parameters and the predictor parameters are learned jointly by solving a single objective function. With this objective function, we minimize the distance between the projections of the image and text of the same document, and the classification error of the projection measured by a hinge loss function. The objective function is optimized by an alternate optimization strategy in an iterative algorithm. Experiments on two different multi-modal document data sets show the advantage of the proposed algorithm over other CFA methods.
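
    A minimal sketch of this supervised variant is given below, under the assumptions of a binary label, plain (sub)gradient updates, and a hinge-loss classifier applied to the image projection only. The dimensions and trade-off weights are illustrative; this is not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_img, d_txt, d_shared = 100, 64, 32, 8
X_img = rng.normal(size=(n, d_img))              # image features per document
X_txt = rng.normal(size=(n, d_txt))              # text features per document
y = rng.choice([-1.0, 1.0], size=n)              # binary class labels

U = rng.normal(scale=0.1, size=(d_img, d_shared))  # image projection matrix
V = rng.normal(scale=0.1, size=(d_txt, d_shared))  # text projection matrix
w = np.zeros(d_shared)                             # classifier in the shared space
lam, C, lr = 0.1, 1.0, 0.001

for _ in range(200):
    Pi, Pt = X_img @ U, X_txt @ V
    margins = y * (Pi @ w)
    active = margins < 1.0                       # documents violating the hinge margin

    # pairing term: projections of the same document should coincide (+ regulariser)
    gU = X_img.T @ (Pi - Pt) + lam * U
    gV = X_txt.T @ (Pt - Pi) + lam * V
    # hinge-loss subgradient terms for the projection and the classifier
    gU -= C * X_img[active].T @ np.outer(y[active], w)
    gw = w - C * (Pi[active].T @ y[active])

    U -= lr * gU
    V -= lr * gV
    w -= lr * gw

# classify documents from the image modality via the shared space
score_img = (X_img @ U) @ w
print("training accuracy:", np.mean(np.sign(score_img) == y))
```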

    Moving object detection for automobiles by the shared use of H.264/AVC motion vectors : innovation report.

    Cost is one of the barriers to wider adoption of Advanced Driver Assistance Systems (ADAS) in China. The objective of this research project is to develop a low-cost ADAS through the shared use of motion vectors (MVs) from an H.264/AVC video encoder originally designed for video recording only. There have been few studies on using MVs from video encoders on a moving platform for moving object detection. The main contribution of this research is a novel algorithm that addresses the problems of moving object detection when MVs from an H.264/AVC encoder are used. The approach is suitable for mass-produced in-vehicle devices because it reuses the encoder's MVs for moving object detection, reducing the cost and complexity of the system while providing the recording function by default at no extra cost. The estimated cost of the proposed system is 50% lower than that of a system using the optical flow approach. To reduce the area of the region of interest and to meet the real-time computation requirement, a new block-based region growth algorithm is used for road region detection. To account for the small amplitude and limited precision of H.264/AVC MVs on relatively slow moving objects, the detection task separates the region of interest into relatively fast and relatively slow speed regions by examining the amplitude of the MVs, the position of the focus of expansion and the result of road region detection. Relatively slow moving objects are detected and tracked using generic horizontal and vertical contours of rear-view vehicles; this addresses the limited precision and erroneous motion vectors that H.264/AVC encoders produce for relatively slow moving objects and for regions near the focus of expansion. Relatively fast moving objects are detected by a two-stage approach comprising a Hypothesis Generation (HG) stage and a Hypothesis Verification (HV) stage, which addresses the problem that H.264/AVC MVs are generated for coding efficiency rather than for minimising the motion error of objects. The HG stage reports a potential moving object by clustering the planar parallax residuals that satisfy the constraints set out in the algorithm, and the HV stage verifies the existence of the moving object based on the temporal consistency of its displacement in successive frames. The test results show a vehicle detection rate higher than 90%, which is on a par with methods proposed by other authors, and the computational cost is low enough to meet the real-time performance requirement. An invention patent, one international journal paper and two international conference papers have been published or accepted, showing the originality of the work in this project. One further international journal paper is under preparation.
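
    The hypothesis-generation idea can be illustrated with the following sketch, which is not the report's algorithm: it compares a block motion-vector field with the flow expected for a static scene radiating from an assumed focus of expansion, then clusters blocks with large planar-parallax residuals as candidate moving objects. The synthetic MV field, FOE position and thresholds are illustrative assumptions.

```python
import numpy as np
import cv2

H, W = 30, 40                         # motion-vector grid (one vector per 16x16 block)
foe = np.array([W / 2.0, H * 0.4])    # assumed focus of expansion, in block units

# Expected flow for a static scene under forward ego-motion: vectors point away
# from the FOE with magnitude growing with distance (a crude planar model).
ys, xs = np.mgrid[0:H, 0:W].astype(float)
expected = 0.05 * np.dstack([xs - foe[0], ys - foe[1]])

# Synthetic encoder MV field: static background plus one truly moving object.
mv = expected + np.random.normal(scale=0.1, size=expected.shape)
mv[18:24, 25:32] += np.array([1.5, 0.0])      # object moving to the right

# Hypothesis generation: a large planar-parallax residual marks candidate blocks.
residual = np.linalg.norm(mv - expected, axis=2)
mask = (residual > 0.8).astype(np.uint8)
num, labels, stats, centroids = cv2.connectedComponentsWithStats(mask, connectivity=8)

for i in range(1, num):                        # label 0 is the background
    if stats[i, cv2.CC_STAT_AREA] >= 4:        # ignore tiny, likely noisy clusters
        print("candidate moving object at block", centroids[i])

# Hypothesis verification (not shown) would track each candidate's centroid over
# successive frames and keep only temporally consistent detections.
```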

    Hardware acceleration using FPGAs for adaptive radiotherapy

    Adaptive radiotherapy (ART) seeks to improve the accuracy of radiotherapy by adapting the treatment based on up-to-date images of the patient's anatomy captured at the time of treatment delivery. The amount of image data, combined with the clinical time requirements for ART, necessitates automatic image analysis to adapt the treatment plan. Currently, the computational effort of the image processing and plan adaptation means they cannot be completed in a clinically acceptable timeframe. This thesis investigates the use of hardware acceleration on Field Programmable Gate Arrays (FPGAs) to accelerate algorithms for segmenting bony anatomy in Computed Tomography (CT) scans, in order to reduce the plan adaptation time for ART. An assessment was made of the overhead incurred by transferring image data to an FPGA-based hardware accelerator using the industry-standard DICOM protocol over an Ethernet connection. The transfer rate was found to be likely to limit the performance of hardware accelerators for ART, highlighting the need for an alternative method of integrating hardware accelerators with existing radiotherapy equipment. A clinically-validated segmentation algorithm was adapted for implementation in hardware. This was shown to process three-dimensional CT images up to 13.81 times faster than the original software implementation, and the segmentations produced by the two implementations showed strong agreement. Modifications to the hardware implementation were proposed for segmenting four-dimensional CT scans. This was shown to process image volumes 14.96 times faster than the original software implementation, and the segmentations produced by the two implementations showed strong agreement in most cases. A second, novel, method for segmenting four-dimensional CT data was also proposed. The hardware implementation executed 1.95 times faster than the software implementation. However, the algorithm was found to be unsuitable for the global segmentation task examined here, although it may be suitable as a refining segmentation in the context of a larger ART algorithm.
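
    As a purely illustrative software baseline (not the thesis's clinically-validated algorithm), the sketch below segments bony anatomy in a synthetic CT volume by thresholding Hounsfield units and keeping the largest connected component. The threshold and the volume are assumptions; it is intended only to indicate the kind of per-voxel and neighbourhood operations a hardware implementation would accelerate.

```python
import numpy as np
from scipy import ndimage

# Synthetic CT volume in Hounsfield units: soft tissue around 40 HU,
# with a cube of "bone" at roughly 700 HU.
vol = np.full((64, 64, 64), 40.0) + np.random.normal(scale=20.0, size=(64, 64, 64))
vol[20:40, 20:40, 20:40] = 700.0

bone_mask = vol > 300.0                        # assumed HU threshold for bone
labels, n = ndimage.label(bone_mask)           # 3-D connected components
if n > 0:
    sizes = ndimage.sum(bone_mask, labels, index=range(1, n + 1))
    largest = 1 + int(np.argmax(sizes))
    bone_mask = labels == largest              # discard small spurious islands
print("segmented voxels:", int(bone_mask.sum()))
```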