1,429 research outputs found

    Joint optimization of manifold learning and sparse representations for face and gesture analysis

    Face and gesture understanding algorithms are powerful enablers in intelligent vision systems for surveillance, security, entertainment, and smart spaces. In the future, complex networks of sensors and cameras may disperse directions to lost tourists, perform directory lookups in the office lobby, or contact the proper authorities in case of an emergency. To be effective, these systems will need to embrace human subtleties while interacting with people in their natural conditions. Computer vision and machine learning techniques have recently become adept at solving face and gesture tasks using posed datasets in controlled conditions. However, spontaneous human behavior under unconstrained conditions, or in the wild, is more complex and is subject to considerable variability from one person to the next. Uncontrolled conditions such as lighting, resolution, noise, occlusions, pose, and temporal variations complicate the matter further. This thesis advances the field of face and gesture analysis by introducing a new machine learning framework based upon dimensionality reduction and sparse representations that is shown to be robust in posed as well as natural conditions. Dimensionality reduction methods take complex objects, such as facial images, and attempt to learn lower dimensional representations embedded in the higher dimensional data. These alternate feature spaces are computationally more efficient and often more discriminative. The performance of various dimensionality reduction methods on geometric and appearance-based facial attributes is studied, leading to robust facial pose and expression recognition models. The parsimonious nature of sparse representations (SR) has successfully been exploited for the development of highly accurate classifiers for various applications. Despite the successes of SR techniques, large dictionaries and high dimensional data can make these classifiers computationally demanding.
Further, sparse classifiers are subject to the adverse effects of a phenomenon known as coefficient contamination, where, for example, variations in pose may affect identity and expression recognition. This thesis analyzes the interaction between dimensionality reduction and sparse representations to present a unified sparse representation classification framework that addresses both issues of computational complexity and coefficient contamination. Semi-supervised dimensionality reduction is shown to mitigate the coefficient contamination problems associated with SR classifiers. The combination of semi-supervised dimensionality reduction with SR systems forms the cornerstone for a new face and gesture framework called Manifold based Sparse Representations (MSR). MSR is shown to deliver state-of-the-art facial understanding capabilities. To demonstrate the applicability of MSR to new domains, MSR is expanded to include temporal dynamics. The joint optimization of dimensionality reduction and SRs for classification purposes is a relatively new field. The combination of both concepts into a single objective function produces a relation that is neither convex nor directly solvable. This thesis studies this problem to introduce a new jointly optimized framework. This framework, termed LGE-KSVD, utilizes variants of Linear extension of Graph Embedding (LGE) along with modified K-SVD dictionary learning to jointly learn the dimensionality reduction matrix, sparse representation dictionary, sparse coefficients, and sparsity-based classifier. By injecting LGE concepts directly into the K-SVD learning procedure, this research removes the support constraints K-SVD imparts on dictionary element discovery. Results are shown for facial recognition, facial expression recognition, and human activity analysis; with the addition of a concept called active difference signatures, the framework also delivers robust gesture recognition from Kinect or similar depth cameras.
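
The general idea of classifying in a reduced space with sparse representations can be illustrated with a minimal sketch. It is not the thesis's MSR or LGE-KSVD method: the projection here is a plain truncated SVD standing in for the semi-supervised dimensionality reduction, the sparse coder is a simple orthogonal matching pursuit, and all names (`omp`, `src_predict`, `P`, `D`) and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def omp(D, y, n_nonzero, tol=1e-8):
    """Greedy orthogonal matching pursuit: code y with at most n_nonzero atoms of D."""
    residual = y.copy()
    support, sol = [], np.zeros(0)
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) < tol:
            break
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Least-squares refit of y on all atoms selected so far.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef

def src_predict(D, labels, y, n_nonzero=2):
    """Assign y to the class whose atoms alone reconstruct it with the smallest residual."""
    x = omp(D, y, n_nonzero)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - D[:, labels == c] @ x[labels == c]) for c in classes]
    return classes[int(np.argmin(residuals))]

# Synthetic data (assumption): two classes scattered around orthogonal prototypes.
rng = np.random.default_rng(0)
dim, per_class, k = 20, 5, 4
protos = np.eye(dim)[:2]
labels = np.repeat([0, 1], per_class)
X = (protos[labels] + 0.02 * rng.standard_normal((2 * per_class, dim))).T  # dim x 10

# Stand-in dimensionality reduction: project onto the top-k left singular vectors.
U, _, _ = np.linalg.svd(X, full_matrices=False)
P = U[:, :k].T
D = P @ X
D /= np.linalg.norm(D, axis=0)  # unit-norm dictionary atoms in the reduced space

pred0 = src_predict(D, labels, P @ (protos[0] + 0.02 * rng.standard_normal(dim)))
pred1 = src_predict(D, labels, P @ (protos[1] + 0.02 * rng.standard_normal(dim)))
print(pred0, pred1)
```

Working in the `k`-dimensional space shrinks every correlation and least-squares step, which is the computational motivation the abstract gives for pairing dimensionality reduction with sparse classifiers.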

    Structured video coding

    Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Architecture, 1991. Includes bibliographical references (leaves 67-71). By Patrick Campbell McLean. M.S.

    Digital Image Processing

    This book presents several recent advances that relate to or fall under the umbrella of 'digital image processing', with the purpose of providing insight into the possibilities offered by digital image processing algorithms in various fields. The presented mathematical algorithms are accompanied by graphical representations and illustrative examples for enhanced readability. The chapters are written in a manner that allows even a reader with basic experience and knowledge in the digital image processing field to properly understand the presented algorithms. At the same time, the information in this book is structured so that fellow scientists will be able to use it to push the development of the presented subjects even further.

    A vector light sensor for 3D proximity applications: Designs, materials, and applications

    In this thesis, a three-dimensional design of a vector light sensor for angular proximity detection applications is realized. 3D printed mesa pyramid designs, along with commercial photodiodes, were used as a prototype for the experimental verification of single-pixel and two-pixel systems. The operation principles, microfabrication details, and experimental verification of micro-sized mesa and CMOS-compatible inverse vector light pixels in silicon are presented, where p-n junctions are created on the pyramid’s facets as photodiodes. The one-pixel system allows for angular estimations, providing spatial proximity of incident light in 2D and 3D. A two-pixel system was further demonstrated to have a wider detection angle. Multilayered carbon nanotubes, graphene, and vanadium oxide thin films as well as carbon nanoparticle-based composites were studied along with cost-effective deposition processes to incorporate these films onto 3D mesa structures. Combining such design and materials optimizations produces sensors with a unique design, a simple fabrication process, and compatibility with readout integrated circuits. Finally, an approach to utilize such sensors in smart energy system applications as solar trackers, for automated power generation optimizations, is explored. However, integration optimizations in complementary-Si PV solar modules were first required. In this multi-step approach, custom composite materials are utilized to significantly enhance the reliability of bifacial silicon PV solar modules. Thermal measurements and process optimizations in the development of imec’s novel interconnection technology in solar applications are discussed. The interconnection technology is used to improve solar modules’ performance and enhance the connectivity between modules’ cells and components.
This essential precursor allows for the effective powering and consistent operation of standalone module-associated components, such as the solar tracker and Internet of Things sensing devices, typically used in remote monitoring of modules’ performance or smart energy systems. Such integrations and optimizations in the interconnection technology improve solar modules’ performance and reliability, while further reducing materials and production costs. Such advantages further promote solar (Si) PV as a continuously evolving renewable energy source that is compatible with new waves of smart city technology and systems.

    Image and Video Forensics

    Nowadays, images and videos have become the main modalities of information being exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia contents are generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on digital social platforms, generating a large volume of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts regarding the identification of the source and the detection of manipulation. In all cases (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks) where images and videos serve as critical evidence, forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges to ensure media authenticity.

    Blood flow measurement in the zebrafish heart using light sheet microscopy

    The link between haemodynamics and cardiac tissue mechanics is an active area of research in developmental biology. Nevertheless, previous studies of fluid-structure interaction in the developing heart were mostly confined to single projection blood flow measurements or computational fluid dynamics simulations using only the information of the heart wall structure. Hence, techniques capable of direct 3D + time resolved measurement of blood flow and heart wall motion are necessary to deepen the understanding of cardiac function in the developing heart. This work presents an imaging system which combines selective plane illumination microscopy (SPIM) with optical gating techniques and micro particle image velocimetry (uPIV). This combination (referred to here as SPIM-uPIV) allowed non-invasive measurement of blood flow in the developing zebrafish heart in a depth- and time-resolved manner. Our system surpasses conventional uPIV measurement systems based on wide-field illumination, which suffer from measurement errors due to volume illumination of the sample. The proposed SPIM-uPIV system was validated in a control microfluidics experiment where the flow of fluorescent microspheres was measured in a 50 µm diameter tube. Both qualitative and quantitative analyses were performed to compare our SPIM-uPIV against conventional brightfield-uPIV measurements. Furthermore, this work implements a different metric for “cross-correlation” which was empirically found to perform better than the traditional algorithm used in PIV analysis when the motion of large particles is measured. By applying optical gating techniques in our analysis, 3D + time blood flow measurements in the beating hearts of 3-, 4-, and 5-day-old zebrafish were obtained. The recovered 3D + time velocity information enabled further investigation of heart function, such as the pumping efficiency, which was obtained by calculating the flow rate through a section of a heart chamber.
In summary, it is proposed that the SPIM-uPIV system can be a useful tool for direct blood flow measurements in transparent small-animal models. Such measurements would benefit the current knowledge of fluid-structure interaction phenomena in the developing heart, and could be used to validate previous work by other groups.
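
For readers unfamiliar with PIV, the traditional cross-correlation step that the abstract contrasts against can be sketched as follows. This is only the textbook baseline, not the alternative metric the work introduces (which the abstract does not specify); the function name `piv_displacement`, the 32x32 window, and the synthetic particle image are assumptions for illustration.

```python
import numpy as np

def piv_displacement(win_a, win_b):
    """Integer-pixel shift (dy, dx) that best maps win_a onto win_b,
    found as the peak of their circular cross-correlation."""
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    # Correlation theorem: corr = IFFT(conj(FFT(a)) * FFT(b)).
    corr = np.fft.ifft2(np.conj(np.fft.fft2(a)) * np.fft.fft2(b)).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices past the window midpoint correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

# Synthetic interrogation windows (assumption): frame_b is frame_a moved
# 3 pixels down and 2 pixels left, with periodic wrap-around.
rng = np.random.default_rng(1)
frame_a = rng.random((32, 32))
frame_b = np.roll(frame_a, shift=(3, -2), axis=(0, 1))

shift = piv_displacement(frame_a, frame_b)
print(shift)
```

Multiplying the pixel shift by the physical pixel size and dividing by the time between frames gives a velocity estimate for that window; repeating this over a grid of windows yields a velocity field.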

    Face Image and Video Analysis in Biometrics and Health Applications

    Computer Vision (CV) enables computers and systems to derive meaningful information from acquired visual inputs, such as images and videos, and make decisions based on the extracted information. Its goal is to acquire, process, analyze, and understand the information by developing a theoretical and algorithmic model. Biometrics are distinctive and measurable human characteristics used to label or describe individuals by combining computer vision with knowledge of human physiology (e.g., face, iris, fingerprint) and behavior (e.g., gait, gaze, voice). The face is one of the most informative biometric traits. Many studies have investigated the human face from the perspectives of different disciplines, ranging from computer vision and deep learning to neuroscience and biometrics. In this work, we analyze face characteristics from digital images and videos in the areas of morphing attack and defense, and autism diagnosis. For face morphing attack generation, we proposed a transformer-based generative adversarial network to generate more visually realistic morphing attacks by combining different losses, such as face matching distance, facial landmark-based loss, perceptual loss, and pixel-wise mean square error. In the face morphing attack detection study, we designed a fusion-based few-shot learning (FSL) method to learn discriminative features from face images for few-shot morphing attack detection (FS-MAD), and extended the current binary detection into multiclass classification, namely, few-shot morphing attack fingerprinting (FS-MAF). In the autism diagnosis study, we developed a discriminative few-shot learning method to analyze hour-long video data and explored the fusion of facial dynamics for facial trait classification of autism spectrum disorder (ASD) in three severity levels. The results show outstanding performance of the proposed fusion-based few-shot framework on the dataset.
In addition, we explored the possibility of performing face micro-expression spotting and feature analysis on autism video data to classify ASD and control groups. The results indicate the effectiveness of subtle facial expression changes for autism diagnosis.