21 research outputs found

    Human Attention Detection Using AM-FM Representations

    Human activity detection from digital videos presents many challenges to the computer vision and image processing communities. Recently, many methods have been developed to detect human activities with varying degrees of success. Yet, the general human activity detection problem remains very challenging, especially when the methods need to work “in the wild” (e.g., without precise control over the imaging geometry). The thesis explores phase-based solutions for (i) detecting faces, (ii) detecting the backs of heads, (iii) jointly detecting faces and the backs of heads, and (iv) determining whether the head is looking to the left or the right, using standard video cameras without any control over the imaging geometry. The proposed phase-based approach builds on simple and robust methods that rely on Amplitude Modulation-Frequency Modulation (AM-FM) models. The approach is validated using video frames extracted from the Advancing Out-of-School Learning in Mathematics and Engineering (AOLME) project. The dataset consisted of 13,265 images from ten students looking at the camera, and 6,122 images from five students looking away from the camera. For the students facing the camera, the method correctly classified 97.1% of those looking to the left and 95.9% of those looking to the right. For the students facing away from the camera, the method correctly classified 87.6% of those looking to the left and 93.3% of those looking to the right. The results indicate that AM-FM based methods hold great promise for analyzing human activity videos.

    The Importance of the Instantaneous Phase in Detecting Faces with Convolutional Neural Networks

    Convolutional Neural Networks (CNNs) have provided new and accurate methods for processing digital images and videos. Yet, training CNNs is extremely demanding in terms of computational resources. Also, for simple applications, the standard use of transfer learning tends to require far more resources than may be needed. Furthermore, the final systems tend to operate as black boxes that are difficult to interpret. The current thesis considers the problem of detecting faces from the AOLME video dataset. The AOLME dataset consists of a large video collection of group interactions recorded in unconstrained classroom environments. For the thesis, still image frames were extracted every minute from 18 videos of 24 minutes each. Then, each video frame was divided into 9x5 blocks of 50x50 pixels each. For each of the 19,440 blocks, the percentage of face pixels was set as ground truth. Face detection was then defined as a regression problem for determining the face pixel percentage for each block. For testing different methods, 12 videos were used for training and validation. The remaining 6 videos were used for testing. The thesis examines the impact of using the instantaneous phase for the AOLME block-based face detection application. For comparison, the thesis compares the use of the Frequency Modulation (FM) image based on the instantaneous phase, the use of the instantaneous amplitude, and the original grayscale image. To generate the FM and AM inputs, the thesis uses dominant component analysis, which aims to decrease the training overhead while maintaining interpretability. The results indicate that the FM image yielded about the same performance as the MobileNet V2 architecture (AUC of 0.78 vs 0.79), with vastly reduced training times. Training was 7x faster on a desktop with an Intel Xeon and a GTX 1080, and 11x faster on a laptop with an Intel i5 and a GTX 1050. Furthermore, the proposed architecture trains 123x fewer parameters than MobileNet V2. The FM-based neural network architecture uses a single convolutional layer. In comparison, the full LeNet-5 applied to the same image blocks of the original image could not be trained for face detection (AUC of 0.5).
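
    The block-based ground truth described in this abstract can be illustrated with a short sketch. It is not the thesis code: the function names are hypothetical, and the layout of 9 blocks across by 5 blocks down (a 450x250-pixel frame) is an assumption, since the abstract only says "9x5 blocks".

```python
import numpy as np

BLOCK = 50          # block side in pixels (from the abstract)
ROWS, COLS = 5, 9   # ASSUMPTION: 9 blocks across, 5 blocks down


def face_pixel_percentage(mask):
    """Return a (ROWS, COLS) array with the percentage of face pixels per block.

    `mask` is a boolean face mask of shape (ROWS*BLOCK, COLS*BLOCK).
    """
    mask = np.asarray(mask, dtype=float)
    if mask.shape != (ROWS * BLOCK, COLS * BLOCK):
        raise ValueError("mask must be %d x %d" % (ROWS * BLOCK, COLS * BLOCK))
    # Split each image axis into (block index, offset inside the block).
    blocks = mask.reshape(ROWS, BLOCK, COLS, BLOCK)
    # Average over the two within-block axes and convert to percent.
    return 100.0 * blocks.mean(axis=(1, 3))


def total_blocks(videos=18, frames_per_video=24):
    """One frame per minute of a 24-minute video gives 24 frames per video."""
    return videos * frames_per_video * ROWS * COLS
```

    With these assumptions, `total_blocks()` reproduces the block count quoted in the abstract: 18 x 24 x 45 = 19,440.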

    The Importance of the Instantaneous Phase in Detecting Faces with Convolutional Neural Networks

    Convolutional Neural Networks (CNNs) have provided new and accurate methods for processing digital images and videos. Yet, training CNNs is extremely demanding in terms of computational resources. Also, for specific applications, the standard use of transfer learning tends to require far more resources than may be needed. Furthermore, the final systems tend to operate as black boxes that are difficult to interpret. The current thesis considers the problem of detecting faces from the AOLME video dataset. The AOLME dataset consists of a large video collection of group interactions recorded in unconstrained classroom environments. For the thesis, still image frames were extracted every minute from 18 videos of 24 minutes each. Then, each video frame was divided into 9x5 blocks of 50x50 pixels each. For each of the 19,440 blocks, the percentage of face pixels was set as ground truth. Face detection was then defined as a regression problem for determining the face pixel percentage for each block. For testing different methods, 12 videos were used for training and validation. The remaining 6 videos were used for testing. The thesis examines the impact of using the instantaneous phase for the AOLME block-based face detection application. For comparison, the thesis compares the use of the Frequency Modulation (FM) image based on the instantaneous phase, the use of the instantaneous amplitude, and the original grayscale image. To generate the FM and AM inputs, the thesis uses dominant component analysis, which aims to decrease the training overhead while maintaining interpretability. Comment: Master Thesis

    Advancements and Breakthroughs in Ultrasound Imaging

    Ultrasonic imaging is a powerful diagnostic tool available to medical practitioners, engineers and researchers today. Due to its relative safety and non-invasive nature, ultrasonic imaging has become one of the most rapidly advancing technologies. These rapid advances are directly related to parallel advancements in electronics, computing, and transducer technology, together with sophisticated signal processing techniques. This book focuses on state-of-the-art developments in ultrasonic imaging applications and underlying technologies, presented by leading practitioners and researchers from many parts of the world.

    Distributed and Scalable Video Analysis Architecture for Human Activity Recognition Using Cloud Services

    This thesis proposes an open-source, maintainable system for detecting human activity in large video datasets using scalable hardware architectures. The system is validated by detecting writing and typing activities that were collected as part of the Advancing Out of School Learning in Mathematics and Engineering (AOLME) project. The implementation of the system using Amazon Web Services (AWS) is shown to be both horizontally and vertically scalable. The software associated with the system was designed to be robust so as to facilitate reproducibility and extensibility for future research

    Non-Invasive Hemodynamic Parameters Assessment using Optoelectronic Devices

    Doctoral thesis in Biomedical Engineering, presented to the Faculty of Medicine of the University of Coimbra. The worldwide incidence of cardiovascular diseases (CVDs) has spurred research efforts targeting new solutions that may enable early detection of the pathological processes associated with these diseases. Special emphasis has been given to methods that allow monitoring of the blood pressure and the arterial pulse waveform, thus providing a more precise tool to complement a diagnosis process based on multi-parameter assessment. From the analysis of arterial pulse pressure waveform features, and of its propagation velocity, important clinical parameters can be extracted in order to evaluate the CVD risk, the vascular adaptation and the therapeutic efficacy. The use of multiple parameters minimizes the error compared to an approach in which a subject is classified based on a single parameter. Emerging trends in cardiovascular monitoring are moving away from invasive and costly technologies towards non-invasive and low-cost solutions. In this sense, optical solutions offer a great advantage due to the immense technological progress of recent decades. The truly non-contact nature of optical techniques allows measurements without distortion of the shape of the pulse curve, which is one of the main limitations of the current commercial devices used in hemodynamic parameter assessment. The main objective of this work is to demonstrate that an optical system can acquire the arterial pulse waveform, without physical contact with the patient's skin, in a configuration that allows local pulse wave velocity (PWV) measurement and the determination of the most important clinical parameters using dedicated algorithms. Four prototypes were developed: three based on non-coherent light and one on coherent light. All the developed optical probes share a common design: two identical photodetectors placed 2 cm apart, which ensures that the pulse wave is detected at two distinct points and allows accurate determination of the local pulse transit time. For the non-coherent light probes, three different photodetectors were tested: an avalanche photodiode, a planar photodiode and a lateral effect photodiode (LEP). The optical system components (probe prototypes and acquisition box) were designed to meet specific requirements for clinical use, such as portability, compact size and low weight, low cost, limited power consumption, ergonomics and an easy user interface, so that the system can be considered an interesting solution for commercial purposes. The in vivo tests allowed the selection of the best algorithm and probe combination to determine PWV: the cross-correlation algorithm and the probe with planar photodiodes proved to be the most efficient. This system showed good reproducibility, as evaluated by both inter-operator and intra-operator analysis. A large study was performed in 131 young subjects, obtaining a mean PWV of 3.33±0.72 m/s, thus confirming its significant increase with age. A comparative test between the distension waveform measured with the optical probe at the carotid artery and the invasive profile of the pulse pressure acquired by an intra-arterial catheter showed a strong correlation (mean value of 0.958), which validates the ability of this non-invasive device to estimate the arterial pulse waveform. A coherent light probe was also developed and tested using several processing techniques based on the short-time Fourier transform and the empirical mode decomposition algorithm. This approach demonstrated the ability to determine the main feature points in the waveform with low error in the pulse transit time determination (less than 5 ms). An alternative configuration for the Doppler effect-based probe was tested, using a photodetector with a larger area in order to obtain the self-mixing effect outside the laser cavity. This feature opened the possibility of building a new probe adapted to this technique in order to improve the quality of the signal and enable biomedical applications. Overall, the results obtained with the developed methodologies (prototypes and associated algorithmic tools) showed that it is possible to measure the arterial pulse waveform in the carotid artery, to determine several clinical parameters and to assess the cardiovascular condition with optical technology. Fundação para a Ciência e Tecnologia - SFRH / BD / 79334 / 201
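
    The local PWV measurement in this abstract rests on a simple relation: detector spacing divided by pulse transit time. The sketch below estimates the transit time from the peak of the cross-correlation between the two photodetector signals. It is a minimal illustration under stated assumptions, not the thesis algorithm: the function names, the mean removal and the whole-sample lag resolution are all choices made here, and only the 2 cm spacing comes from the abstract.

```python
import numpy as np


def transit_time_xcorr(proximal, distal, fs):
    """Estimate the delay (seconds) of `distal` relative to `proximal`.

    Both inputs are 1-D signals sampled at `fs` Hz. The delay is the lag of
    the cross-correlation peak, so its resolution is one sample (1 / fs).
    """
    p = np.asarray(proximal, dtype=float)
    d = np.asarray(distal, dtype=float)
    if p.ndim != 1 or p.shape != d.shape:
        raise ValueError("signals must be 1-D and of equal length")
    # Remove the DC offset so the peak reflects the pulse shape.
    p = p - p.mean()
    d = d - d.mean()
    xcorr = np.correlate(d, p, mode="full")
    # Index 0 of the "full" output corresponds to lag -(len(p) - 1).
    lag = int(np.argmax(xcorr)) - (p.size - 1)
    return lag / fs


def local_pwv(proximal, distal, fs, spacing_m=0.02):
    """Local pulse wave velocity in m/s for detectors `spacing_m` apart."""
    dt = transit_time_xcorr(proximal, distal, fs)
    if dt <= 0:
        raise ValueError("distal signal does not lag the proximal signal")
    return spacing_m / dt
```

    For example, a 6 ms transit time over 2 cm gives 0.02 / 0.006 ≈ 3.3 m/s, the order of magnitude reported in the abstract. Because the lag is quantized to whole samples, a high sampling rate (or sub-sample peak interpolation, not shown) is needed for transit times of a few milliseconds.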

    Frequency Domain Decomposition of Digital Video Containing Multiple Moving Objects

    Motion estimation has been dominated by time domain methods such as block matching and optical flow. However, these methods have problems with multiple moving objects in the video scene, moving backgrounds, noise, and fractional pixel/frame motion. This dissertation proposes a frequency domain method (FDM) that solves these problems. The methodology introduced here handles multiple moving objects, with or without a moving background, through a 3-D frequency domain decomposition of digital video as the sum of locally translational motions (or, in the case of the background, a globally translational motion), with high noise rejection. Additionally, fractional pixel/frame motion detection and quantification are accomplished via a version of the chirp-Z transform. Furthermore, images of particular moving objects can be extracted and reconstructed from the frequency domain. Finally, this method can be integrated into a larger system to support motion analysis. To verify these claims, the method has been tested with synthetic data, realistic high-fidelity simulations, and actual data from established video archives, all presented here. In addition, a comparison with an up-and-coming spatial domain method, incremental principal component pursuit (iPCP), is presented, in which the FDM performs markedly better.
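
    The reason translational motion is convenient in the frequency domain is the Fourier shift theorem: shifting an image multiplies its spectrum by a linear phase. The sketch below uses that fact in its simplest form, phase correlation between two frames, to recover a single global integer shift. This is a standard textbook technique swapped in for illustration; it is not the dissertation's FDM, which decomposes the 3-D spectrum, handles several objects and resolves fractional motion. The function name is hypothetical.

```python
import numpy as np


def phase_correlation_shift(frame_a, frame_b):
    """Return (dy, dx) such that frame_b is frame_a circularly shifted by (dy, dx)."""
    a = np.asarray(frame_a, dtype=float)
    b = np.asarray(frame_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2:
        raise ValueError("frames must be 2-D and of equal shape")
    spec_a = np.fft.fft2(a)
    spec_b = np.fft.fft2(b)
    # Cross-power spectrum; normalizing keeps only the phase difference.
    cross = spec_b * np.conj(spec_a)
    cross /= np.maximum(np.abs(cross), 1e-12)
    # The inverse transform of a pure linear phase is an impulse at the shift.
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = a.shape
    # Map shifts beyond half the frame size to negative values.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)
```

    The normalization step discards amplitude and keeps phase, which is what makes the correlation peak sharp. It also hints at why several translating objects can be separated in the frequency domain, since each contributes its own linear phase term.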

    Modulation Domain Image Processing

    The classical Fourier transform is the cornerstone of traditional linear signal and image processing. The discrete Fourier transform (DFT) and the fast Fourier transform (FFT) in particular led to profound changes during the later decades of the last century in how we analyze and process 1D and multi-dimensional signals. The Fourier transform represents a signal as an infinite superposition of stationary sinusoids each of which has constant amplitude and constant frequency. However, many important practical signals such as radar returns and seismic waves are inherently nonstationary. Hence, more complex techniques such as the windowed Fourier transform and the wavelet transform were invented to better capture nonstationary properties of these signals. In this dissertation, I studied an alternative nonstationary representation for images, the 2D AM-FM model. In contrast to the stationary nature of the classical Fourier representation, the AM-FM model represents an image as a finite sum of smoothly varying amplitudes and smoothly varying frequencies. The model has been applied successfully in image processing applications such as image segmentation, texture analysis, and target tracking. However, these applications are limited to analysis, meaning that the computed AM and FM functions are used as features for signal processing tasks such as classification and recognition. For synthesis applications, few attempts have been made to synthesize the original image from the AM and FM components. Nevertheless, these attempts were unstable and the synthesized results contained artifacts. The main reason is that the perfect reconstruction AM-FM image model was either unavailable or unstable. Here, I constructed the first functional perfect reconstruction AM-FM image transform that paves the way for AM-FM image synthesis applications. The transform enables intuitive nonlinear image filter designs in the modulation domain. I showed that these filters provide important advantages relative to traditional linear translation invariant filters.
    This dissertation addresses image processing operations in the nonlinear nonstationary modulation domain. In the modulation domain, an image is modeled as a sum of nonstationary amplitude modulation (AM) functions and nonstationary frequency modulation (FM) functions. I developed a theoretical framework for high fidelity signal and image modeling in the modulation domain, constructed an invertible multi-dimensional AM-FM transform (xAMFM), and investigated practical signal processing applications of the transform. After developing the xAMFM, I investigated new image processing operations that apply directly to the transformed AM and FM functions in the modulation domain. In addition, I introduced two classes of modulation domain image filters. These filters produce perceptually motivated signal processing results that are difficult or impossible to obtain with traditional linear processing or spatial domain nonlinear approaches. Finally, I proposed three extensions of the AM-FM transform and applied them in image analysis applications.
    The main original contributions of this dissertation include the following.
    - I proposed a perfect reconstruction FM algorithm. I used a least-squares approach to recover the phase signal from its gradient. In order to allow perfect reconstruction of the phase function, I enforced an initial condition on the reconstructed phase. The perfect reconstruction FM algorithm plays a critical role in the overall AM-FM transform.
    - I constructed a perfect reconstruction multi-dimensional filterbank by modifying the classical steerable pyramid. This modified filterbank ensures a true multi-scale multi-orientation signal decomposition. Such a decomposition is required for a perceptually meaningful AM-FM image representation.
    - I rotated the partial Hilbert transform to alleviate rippling artifacts in the computed AM and FM functions. This adjustment results in artifact free filtering results in the modulation domain.
    - I proposed the modulation domain image filtering framework. I constructed two classes of modulation domain filters. I showed that the modulation domain filters outperform traditional linear shift invariant (LSI) filters qualitatively and quantitatively in applications such as selective orientation filtering, selective frequency filtering, and fundamental geometric image transformations.
    - I provided extensions of the AM-FM transform for image decomposition problems. I illustrated that the AM-FM approach can successfully decompose an image into coherent components such as texture and structural components.
    - I investigated the relationship between the two prominent AM-FM computational models, namely the partial Hilbert transform approach (pHT) and the monogenic signal. The established relationship helps unify these two AM-FM algorithms.
    This dissertation lays a theoretical foundation for future nonlinear modulation domain image processing applications. For the first time, one can apply modulation domain filters to images to obtain predictable results. The design of modulation domain filters is intuitive and simple, yet these filters produce superior results compared to those of pixel domain LSI filters. Moreover, this dissertation opens up other research problems. For instance, classical image applications such as image segmentation and edge detection can be re-formulated in the modulation domain setting. Modulation domain based perceptual image and video quality assessment and image compression are important future application areas for the fundamental representation results developed in this dissertation.
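
    The idea of representing a signal by an amplitude function and a phase (whose derivative is the FM function) can be shown in one dimension with the analytic signal. The sketch below is a deliberate simplification and an assumption of this note: it handles a single-component 1-D signal with an FFT-based Hilbert transform, whereas the dissertation's xAMFM is a multi-component, multi-dimensional transform built on a modified steerable pyramid. The function names are hypothetical.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D array (FFT-based Hilbert transform)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies, zero the rest.
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)


def am_fm_demodulate(x, fs):
    """Return (amplitude, phase, instantaneous frequency in Hz) of a 1-D signal."""
    z = analytic_signal(x)
    amplitude = np.abs(z)                    # AM function
    phase = np.unwrap(np.angle(z))           # unwrapped phase
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)  # FM function (phase derivative)
    return amplitude, phase, inst_freq
```

    In this 1-D case the original signal is recovered as `amplitude * np.cos(phase)`, because that product is the real part of the analytic signal. This mirrors, in a much simpler setting, the perfect reconstruction property the abstract emphasizes.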

    Molecular Imaging

    The present book gives an exceptional overview of molecular imaging. A practical approach is the common thread running through the whole book, which at the same time provides detailed background information reaching down to the molecular and cellular levels. Ideas on how molecular imaging will develop in the near future are a particular highlight. This should be of special interest, as the contributors are members of leading research groups from all over the world.